Search Optimization in the Distributed Networks.

S. Osokine.
osokin@osokin.com
15 Oct 2002.

Abstract

Superpeers (also sometimes called ultrapeers) are generally considered to be a promising approach to content search improvement in the distributed networks. This study analyzes the performance of several superpeer architectures from the practical standpoint, comparing the different architectures and superpeer role assignment algorithms when the goal is to to maximize the search query reach in a peer-to-peer network. The study also investigates the performance of the 'Local Indices' mutual index caching architecture from the same standpoint, and suggests an optimal caching approach that tries to maximize the peer-to-peer search performance by distributing the superpeer functionality across all network nodes. This approach can significantly improve the query reach in comparison with the superpeer architectures, since it allows all network hosts to maximize their contribution to the overall network resource pool, potentially approaching the theoretical limits of the peer-to-peer network search performance. The study shows that the search performance of this hybrid approach is directly tied to the average host bandwidth and to its hardware (RAM and CPU) resources, as opposed to the normal superpeer case, when the search performance is tied to the bandwidth and resources of the 'best' network hosts. Thus the suggested algorithm does not require careful superpeer selection and does not present the high-profile superpeer targets to the attacker, which makes it less fragile than the conventional superpeer architectures.

Contents

1. Introduction
2. Basic Assumptions
     2.1. Full query message utilization
     2.2. Infinite network size
     2.3. Reach maximization for all queries
     2.4. Full host bandwidth utilization
     2.5. Host behaviour model
     2.6. Infinite CPU speed
3. Flat Network Architectures
     3.1. Identical-host symmetric link network
     3.2. Identical-host asymmetric link network
     3.3. Heterogeneous symmetric link network
     3.4. Arbitrary bandwidth case
4. Superpeer Architectures
     4.1. Plain superpeer network
     4.2. Plain superpeer network: explanations
     4.3. Non-routing superpeers
     4.4. Redundant superpeer clusters
     4.5. Optimal redundancy level
     4.6. Superpeer network problems
5. Mutual Index Caching
     5.1. Mutual index caching performance
     5.2. Optimal search conditions
     5.3. Index transfer bandwidth
     5.4. Host search value maximization
     5.5. Asymmetric link case
     5.6. Decentralized network control
     5.7. Suboptimal decentralized control
6. Conclusion
     6.1. Acknowledgements
7. References
Appendix A. Superpeer query failure probability
Appendix B. 'Napster meltdown' and why Gnutella failed to die
Appendix C. Index transfer traffic averaging


1. Introduction.

   Peer-to-peer networks gain more and more popularity on the Internet. It is difficult to provide the precise numbers - in part due to their explosive growth - but it would be fair to say that at the time of this writing (October, 2002) the total number of P2P network users is probably well over one hundred million, and the number of simultaneously connected hosts in such networks as Gnutella [1] and Kazaa [2] is measured in millions.

   One of the most important components of any peer-to-peer system is its search engine, and many projects are being actively developed in this area. The goal of the search engine is to discover the resources (files, hosts, Web services, etc) scattered over the net and to present the list of these resources to the requestor. All search approaches can be roughly divided in two groups:

   The first group includes the approaches that seek to route the search queries to the nodes that are more likely to satisfy these queries. This group includes such projects as CAN [3], Pastry [4], Chord [5], Tapestry [6], PNRP [7], and other similar projects. Typically the projects in this group seek to design the search mechanisms with per-request load proportional to the logarithms of the number of nodes in the distributed network and/or of the number of content items in it. The goal here is to make sure that every item of content is guaranteed to be located within a bounded number of operations (request forwarding hops, comparisons, etc).

   This guaranteed search comes for a price - the 'foreign data' storage requirements may have to be forced on the hosts, the request has to specify the precise identifier of the data item and so on.

   The second group of the search algorithms includes the ones that require the search request propagation to the number of nodes comparable to the number of nodes in the network for the search to be guaranteed. Since this is often not achievable in practice, these algorithms typically do not guarantee the search success, but rather try to achieve the most extensive search given the current network hardware resources and architecture.

   This group includes Napster [8], Gnutella [1], Kazaa [2] and other similar systems. It might seem strange that a centralized Napster search is paired with pure peer-to-peer Gnutella one. After all, Napster seems to symbolize the centrally controlled peer-to-peer systems, which use the central server to organize the behaviour of the clients. Gnutella, on the contrary, is a fully decentralized self-organizing network without any central control points, which is cooperatively built by the independent interacting networked software agents. An attempt to control the Gnutella network is not unlike an attempt to control the growth of the coral reef.

   But here we are comparing the search mechanisms, not the control architectures. And Napster - exactly as Gnutella - can find the file only after looking through the local index of the host that stores that file. The fact that this index is physically stored on one of the Napster's central servers does make the search more effective; nevertheless, Napster is constrained by its central servers' capacity, and has to split the network into several semi-independent 'slices' in order to carry that load.

   Of course, the descriptions above invite the question: who would want to use the non-guaranteed search systems when the guaranteed ones are available? Well - as luck would have it, in the past years the guaranteed search systems were being adopted much slower and currently have several orders of magnitude smaller user base than the non-guaranteed search ones. There are many reasons for that - for example, the non-guaranteed search peer-to-peer architectures are much simpler to implement, allow the 'fuzzy' search queries ('brit*.mp3', etc), and do not unduly burden the user, not requiring him to dedicate part of his hard disk to store items that he has no personal interest in. However, all these attractive features would not be able to ensure such a huge adoption rate advantage over the guaranteed-search systems, if the non-guaranteed search architectures would not be able to satisfy the average user of a peer-to-peer system - which they obviously do.

   It is not a coincidence that all three of the non-guaranteed search systems above have (or had, in case of Napster, before it was shut down by the court) tens of millions of users. Apparently they satisfy their users despite all their disadvantages - which are significant.

   As a matter of fact, these disadvantages become more pronounced as the peer-to-peer networks grow. Once upon a time, it was possible to query every host in the Gnutella network in search of some special file. Then the query was able to reach about half of all hosts. Then twenty percent. Now the same query would be lucky to reach a single-digit percentage of all network hosts, which makes a search for the rare file a tedious and time-consuming lottery. Kazaa is doing better in that respect, but its query seems to be unable to reach all the network hosts, too. All this makes the attempts to widen the query reach to the biggest possible number of hosts a very active research field today - even when the search is not guaranteed, increasing its probability of success is very important for the users' satisfaction and for peer-to-peer system competitiveness and survival.

   And one of the most widespread approaches to this problem is to use the superpeers (or ultrapeers) - the well-connected hosts that act as the concentrators, holding the content indices and routing the queries on behalf of the 'leaf hosts'. In this scenario, the ad hoc peer-to-peer network becomes semi-structured, with a two-level topology, as opposed to the flat topology of the early peer-to-peer systems.

   In fact, it is easy to see that the Napster search architecture can be viewed as a special case of the superpeer search architecture. From the search mechanism standpoint the Napster central directory servers are equivalent to a network with a few very large superpeers that do not forward the queries to each other (due to the resource constraints or for some other reason) and perform all the searches locally - only among the leaf nodes of the same superpeer. (Of course, from most of the other standpoints the Napster server is very different from a decentralized network superpeer - it is a highly specialized central control application that has little in common with the independently interacting superpeer agents.)

   It has long been an article of faith among the peer-to-peer developers that superpeers should be somehow able to solve the request reach problem, dramatically increasing the chances of success for the search. But only recently the serious research is starting to be performed in this area. The study [9], for example, is answering the questions about the best practices in superpeer architecture, superpeer performance and about the ways to minimize the load on the superpeers, allowing them to route more data on behalf of the users.

   At the same time, the same researchers were investigating the alternative approaches to the query reach optimization, trying to quantatively analyze the peer-to-peer network performance in case of the different routing and index storage approaches [10]. Unfortunately for the practical peer-to-peer network developers, thorough and solid studies [9] and [10] seek to maximize the search efficiency - essentially the authors freeze the search request reach or a number of returned results at some fixed value and try to minimize the system load under these conditions.

   This serves a purpose related, but not exactly equivalent to maximizing the query reach. In practical peer-to-peer system design the system load does not necessarily have to be minimized. In fact, as long as the individual host load does not exceed certain thresholds, it might be beneficial to increase it, as long as this increases the request reach and search success rate, which is the main goal of the practical search engine design. This study tries to do exactly that, applying the reasoning similar to [9] and [10] in order to maximize the query reach and search success rate.

(Query reach and search success rate criteria are not exactly equivalent, especially when the 'Local Indices' mutual index caching [10] is used. When the hosts cache the multiple indices on behalf of other hosts, even a query that reaches a single host can have a significant probability of success. However, it is often convenient to measure both query reach and search success rate in 'hosts'. Then a query reach of N hosts means that a query returns the number of results equal to the one that would be returned if the same query would be directly applied to N average P2P hosts, even though in reality, for example, it might have been applied to a much smaller set of superpeers. In that case 'search success' and 'query reach' optimization criteria become synonymous.)

   Of course, saying that efficiency and request reach criteria are different might sound strange, since the architectures with lower system loads should also result in a wider request reach. Generally this is really the case. However, in superpeer architectures the superpeers can become the performance bottlenecks, whereas the 'leaf' nodes would be mostly idle. So unloading the well-connected superpeers and loading the low-bandwidth hosts might lead to the dramatic search performance improvements, even though both aggregate and per-request system loads might be increased in the process. Also, since the architectural design recommendations are necessarily 'fuzzy' and combine a great deal of theoretical research and experimental data into a few seemingly simple (and sometimes even counterintuitive) recommendations, changing the optimization criteria can lead to the dramatically different architectural recommendations, as it will be shown below in Section 5.

2. Basic Assumptions.

   This section establishes several basic assumptions to be used throughout the rest of this article. These assumptions simplify the distributed system analysis and may or may not be true in a real peer-to-peer system, depending on its scale, purpose, and design. However, the changes brought by abandoning these assumptions are not of the extremely dramatic nature, and typically just would require including some limited-range coefficients in the analytical formulae below.

   It is important to note that even though the assumptions below are illustrated by the Gnutella network examples, these assumptions are of the general nature - this study tries to rely as little as possible on the query message delivery mechanisms specific to any particular distributed search network. Generally we try to assume only that the query is somehow delivered to a set of remote hosts that query their local databases and somehow return the results to the querying host. This might be illustrated by Figure 1:

Figure 1. Query message delivery to the remote hosts.

- here the mechanism of the message delivery might be a Gnutella [1] breadth-first traversal (BFS), random walkers [11], LimeWire GUESS proposal [12] or some other method. We are mostly concerned with the limits imposed on this delivery by the network resource constraints (bandwidth and CPU of the network hosts).

2.1. Full query message utilization.

   First, we assume that every query message transmitted over the wire is not dropped or ignored by the receiving host. In Gnutela context, this assumption is equivalent to the zero redundancy in the query propagation graph, or to the infinitely large network with the random connection establishment algorithm.

   The statement above deserves some explanation. Typically the Gnutella network topology graph is presented by pictures similar to Figure 2 in [13], or Figure 2 below:

Figure 2. Gnutella network graph.

   The pictures like this one are useful for visualizing the ad hoc topology of the Gnutella network, but they do create a flawed perception of the Gnutella network redundancy. For example, if the host g decides to query such a network, it sends the query to its neighbours d and h, which retransmit it to their neighbours a, e, b, c, f, and i. Host e receives the same query from both d and h. Since Gnutella queries have unique IDs, e answers just one of these queries, and ignores the duplicate, but half of its incoming bandwidth is wasted on the transmission of the redundant query. The situation becomes even worse on the next hop - by now, all hosts have already seen this query, but will receive multiple copies of it anyway, as, for example, b, e, and f will transmit the redundant information to c.

   This example creates an impression that 'looping' (redundant query transmission) is a serious problem for the Gnutella network [14]. Fortunately, this is not exactly the case. When the new hosts are connecting to the Gnutella network, they select the connection points from the very large lists of already connected hosts (or hosts known to be connected in the past). Further, the whole subject of the query reach increase came to light when the network became too large for a single query to reach every other host on the network. So the probablity of these randomly selected connection points being close in the 'hop space' is very low, and keeps becoming lower and lower as the network keeps growing.

   As a result, from the single host viewpoint the Gnutella network topology is closer not to Figure 2, but to Figure 3, or to the graphs presented on Figure 13 in [15] and on the GnuMap project page [16]:

Figure 3. Single-host view of the Gnutella network graph.

   Of course, some redundancy and looping are always present, but their influence on the aggregate traffic calculations is not very noticeable in the large network. Essentially, for the query propagation purposes the Gnutella network can be viewed as a tree with the root at the query-originating host. If this assumption will not happen to be true for some peer-to-peer network, then either the traffic statistics calculation will have to be adjusted to allow for the bandwidth wasted on looping, or the redundancy will have to be eliminated, either using methods similar to the ones suggested in [14], or (as suggested in [17]) using the well-known cycle avoidance techniques [18].

2.2. Infinite network size.

   This assumption is closely related to the previous one - it is much simpler to analyze the query propagation if it is not expected to reach all network hosts, since we don't have to consider the special case of the full query reach and cap all the formulae accordingly. This is also required for the previous assumption to be true.

   This assumption is really just a convenience - if it is not true, we already have the full network coverage by a single query, and further optimization is pointless. The inevitable (in this case) redundancy will not have a detrimental effect on the query reach, even though it might result in a sub-optimal bandwidth utilization.

2.3. Reach maximization for all queries.

   Many recent search optimization proposals ([10], [11], [12], [17], [19]) are seeking to reduce the number of nodes that process an average query without degrading the search quality. In many cases these approaches utilize the huge disparity in popularity of the different content items. The popular files in the file-sharing system can be present on a significant percentage of the network hosts, whereas rare files can be present only on a few hosts. Since the most frequent queries are likely to look for the most poplular files, a significant resource economy can be achieved if the search can be terminated immediately after reaching a certain number of positive matches.

   This approach is valid and interesting, but for the purposes of this study we assume that such an optimization is not used, and every search reaches the same average number of storage hosts. Setting aside the analysis simplification benefits when such an assumption is used, there are two additional reasons to use this approach:

   So here we'll keep in mind that there are some very valuable optimizational possibilities in this approach (it is certainly tempting to be able to answer 92-93 percents of the queries locally from a 10,000-leaf superpeer), but we'll restrict ourselves to the scenario when a superpeer does have to forward the incoming queries to the widest possible range of the other superpeers. Especially since this 10,000-leaf superpeer in [20] was a Napster 'slice' server that obviously requires a lot of resources - the savings in the more realistic scenario of 400-leaf superpeer are significantly lower.

   And of course, the numbers above reflect only the queries with no results at all - realistically, the usable peer-to-peer file-sharing system will be much better off if it will try to return not just one, but many results for every query. This is necessary for both multi-source downloads, which are becoming the standard feature of any successful file-sharing system, and for having an ability to select the proper answer from the results matching a non-precise query. These factors make it necessary to require many answers for a query, so even the superpeers with some results found might have to forward the query to other superpeers anyway, which will obviously make the savings less impressive (see Appendix A.)

2.4. Full host bandwidth utilization.

   As opposed to the studies [9] and [10], which freeze the expected number of the returned results and seek to minimize the bandwidth utilization, here we assume that we need to maximize the expected number of query results, and that the bandwidth of the peer-to-peer network hosts is going to be the major bottleneck that prevents the full network querying by a single search.

   So this study assumes that the peers' bandwidth is saturated by the transmission of requests. For a large-scale peer-to-peer network, this assumption is quite realistic. It has been noted long ago that the traffic in the Gnutella network tends to saturate the dial-up modem links ([21], [22]). This fact even prompted predictions about the eventual overload and death of the Gnutella network [23].

   Of course, nothing like this ever happened, and Appendix B explains in more detail why exactly it didn't. What is important for us here, is that the Gnutella queries' propagation is not stopped by their TTL (as a surprising number of people still think), but by the flow control mechanisms that selectively drop the queries if and when the physical network links start to get overloaded.

   This assumption is just an analysis guideline; in practice, there will be multiple scenarious when the host bandwidth won't be saturated:

2.5. Host behaviour model.

   Throughout this study, we are using certain statistical assumptions about the storage and querying behaviour of the average host in the peer-to-peer network. The assumptions used here reflect the expected behaviour of a host in a generic file-sharing network, and can be different for a real file-sharing application. Still, we tried to set the realistic expectations and use the values that would reflect the typical behaviour of the host in the existing file-sharing networks.

   The primary values used in the subsequent sections are the average host storage size , session length , interval between queries , query message size , file index record size , and i-th host bandwidth . Unless specifically stated otherwise, we'll be using the following values for these parameters:

files, seconds, seconds, bytes, bytes. (1)

and

bytes/sec for the modem connection. (2)

   These assumptions mean that the average host stores 200 files with an index record (name, author and other searchable metadata) for every file taking 100 bytes. The host connects to the peer-to-peer network through a modem connection capable of transmitting 5000 bytes of search requests per second, or 50 queries per second, since the average query size is 100 bytes. On the average, the host spends one hour at a time online, and while it is online, it issues a file search query (100 bytes) every 200 seconds - which is equivalent to the QueryJoinRatio of 18 (as defined in [10]).

   Even though it was not our intention to find the exceedingly accurate values for these parameters, they do roughly correspond to the values observed in the existing peer-to-peer networks. These numbers are loosely based on the existing studies of the peer-to-peer networks ([10], [20], [26]), on the observations of the average per-client storage reported by the Kazaa [2] network client in October of 2002 (which reported approximately 180 stored files per client) and on the estimates and discussions in the Gnutella [1] community about the typical Gnutella client behaviour.

   Since the great accuracy was not required, some numbers were rounded and/or changed to the values provided by less precise, but more recent estimates. For example, the query message size of 100 bytes is used instead of the 112-byte estimate from [26]. The 100-byte index record (metadata) size of the file is used instead of thed 208-byte value from [26] and 72-byte - from [9]. The average number of 200 files stored by a host is taken instead of the non-precise Kazaa client measurements (180), and 168 and 340 suggested in [26] and [10] correspondingly. The one-hour session length and 200-second interval between queries are not substantiated by any published research and are basically the 'educated guesses', based on the community discussions and observations. In any case, these two last numbers are most likely to change, since they are based on such application-dependent technical decisions as the re-request frequency (to find more download sources, for example) and on whether the application is automatically started in the user's machine boot sequence or not.

   The modem bandwidth (2) roughly corresponds to the 56-kbit modem throughput after the factors listed in Section 2.4 are taken into account. When the high-bandwidth connections will appear in our study, their bandwidth will be similarly estimated without any great precision - just to give us some rough numbers to work with.

2.6. Infinite CPU speed.

   Unlike [9], [10], [20], and [26], for the purposes of this study we assume that the host CPU is infinitely fast and is not a bottleneck in any of the peer-to-peer host operations. This approach is a direct consequence of the assumption made in Section 2.4 - that typically the network link capacity is a performance bottleneck.

   Of course, as it was already mentioned, this might not be true for the well-connected hosts. However, the majority of the hosts in the network like Gnutella [1] are not CPU-bound. This is why to simplify the matter, we simulate the CPU overload by applying an effective bandwidth cap to the overloaded network hosts.

   To roughly estimate this cap value, we use the CPU costs of the atomic actions provided in [9]. This data suggests that for the query size (1) the CPU cost of receiving one query, processing it, and maybe retransmitting one copy of the query, will be about 2M clock cycles on the Pentium III 930-MHz processor. This translates into approximately 500 queries per second limit, or the effective bandwidth cap of 50 kilobytes per second. After applying the frequently used in practice '10 bits per byte' estimate, which we'll be using throughout this study, this is equivalent to the bandwidth of about 500 kilobits per second.

   This bandwidth cap value is based on several assumptions:

   In any case, since some or all of these assumptions might be wrong, in this study we won't be using a strict 500-kilobits or 1-megabit per second cap, but rather investigate the range of the values likely to exist in the particular peer-to-peer network architecture.

   Now let's apply these assumptions to the performance analysis of various peer-to-peer network architectures.

3. Flat Network Architectures.

   In this section we'll estimate the performance of the 'flat' peer-to-peer network architectures, where all hosts are responsible only for their own data and index storage and treat the network traffic in a uniform fashion according to their capacities. No superpeers are present to store the indices for other, 'leaf' hosts and to route the query traffic on their behalf.

   This analysis is needed for two purposes: first, it establishes the baseline performance to be used as a reference point - the performance of the other, more complicated architectures will be compared to it. And second, the results of this analysis will be used in the analysis of the more complicated architectures - as we'll see later, often these architectures can be treated as the 'flat' ones, with the superpeers forming an effectively 'flat' network, whose performance can be evaluated according to the formulae from this section.

3.1. Identical-host symmetric link network.

   The simplest case of the flat architecture is when all hosts in the infinite network have the same bandwidth, and the incoming and outgoing bandwidth is the same:

(3)

   The mechanism used to propagate the query messages through the network is not important here - it might be Gnutella flooding, random walkers, or just direct requests from one host to another. What is important is that every host receives these query messages and performs the searches in its local database:

Figure 4. Query messages received by the host.

   Obviously the average receiving host ('e' on Figure 4) cannot receive more than B bytes per seconds of these messages, so its limit on the number of local searches performed per second is , and the total number of local searches per second performed by all hosts in the network is .

   Since in the flat network architecture all hosts are functionally equivalent, all these searches are initiated by the same hosts that are performing them. So also gives us the average number of local searches per second that can be initiated all over the network by a typical querying host. And since these 'remote' local searches are the result of a querying operation performed by an average host once in seconds, the total number of hosts reached by this querying operation is:

, (4)

and the total number of data items (files) covered by the search is:

(5)

   Of course, in order to achieve such a search coverage, all network hosts should agree to perform a certain amount of data transmission - if the average host sends out bytes of data per second, then obviously network-wide , and

(6)

   So in case of the symmetric network links the query reach (4), (5) is possible only if all hosts fully utilize their outgoing bandwidth - whether by broadcasting queries for other participants, performing 'random walking' on their behalf, or just sending multiple direct queries themselves (not that the last variant is recommended for a practical peer-to-peer system, but from the traffic and query reach standpoints it is equivalent to the previous ones).

   For the parameter values defined in (1), (2) (modem hosts only), the search request can reach about

hosts, (7)

which is close enough to the values observed on the real Gnutella network (5,000-10,000 hosts). In this study we do not seek to achieve the exact match of the performance numbers, but rather to find out the laws that determine the search performance and to suggest the ways of its improvement.

3.2. Identical-host asymmetric link network.

   After the analysis performed in the previous section, it probably won't come as a surprise that if all hosts in the peer-to-peer network have identical asymmetric links (ADSL, cable modem, etc), the network performance (request reach) will be defined by the lower bandwidth value:

(8)

   For the real ADSL and cable modem connections the outgoing bandwidth is likely to be the lower one, so in practice the query reach in such a network will be limited not by a high, 1.5 Mbits/sec+ values of the incoming bandwidth, but by the much lower (128-384 kilobits/sec) outgoing one.

   This is quite natural - if someone receives the data, someone has to send it, and since in our sample network all hosts have the similar connections, only the outgoing network links will be saturated, whereas the incoming ones will be mostly idle. If the network wouldn't be 'flat', we could've found some other uses for this excess capacity, but that will be the subject of the subsequent sections.

3.3. Heterogeneous symmetric link network.

   Now let us consider the case of the symmetric-link network, in which different hosts have different bandwidth. Let's say that some hosts (we'll denote their percentage as ) have the higher bandwidth than the rest of the hosts, which have the bandwidth .

   The total number of local searches that can be performed by all hosts in such a network in one second is , and, repeating the reasoning that led us to (4) and (5), we arrive to the request reach of

, and (9)

where is the average host bandwidth:

(10)

   It is easy to see that (9) stays correct also for the arbitrary bandwidth distributions - then if we have hosts in the network,

(11)

   Naturally, (9) and (11) define only the maximum achievable network performance. For these numbers to be actually reached, the low-bandwidth and high-bandwidth hosts have to find such a way to coexist on the network that would assure the full bandwidth utilization for the high-bandwidth hosts.

   If, for example, the majority of the Gnutella hosts are connected through the modems and have five connections to the other hosts, then each connection carries about one kilobyte of traffic per second. So if we open the same five connections on a well-connected host, it will route only five kilobytes per second - exactly the same amount of data as a modem one, and won't improve the network performance at all. Its bandwidth will be grossly underutilized.

   This is why typically the Gnutella vendors try to open more connections for the better-connected hosts, which allows the network as a whole to fully utilize their resources. Ideally the number of Gnutella connections for the host should be proportional to its effective bandwidth - the one after various bandwidth caps (as defined in Section 2.6) are taken into account.

3.4. Arbitrary bandwidth case.

   Now we are ready to analyze the case of the network in which every host can have an arbitrary incoming and outgoing bandwidth and .

   From the previous sections it is clear that in order to maximize the search request reach in such a network, we need to maximize the total query traffic over the whole network. This traffic is limited by the lower of the summary sending and summary receiving capacities of all network hosts:

   

(12)

   So the search request reach in this case is determined by the equation similar to (9):

, and (13)

- except that the average bandwidth is defined as:

(14)

   In the practical Internet case, it is likely that the outgoing bandwidth of the asymmetrically connected hosts will be lower that the incoming one, so we'll be able to find the average bandwidth as:

(15)

   Here we use the index '0' for the bandwidth and request reach variables to define the search performance of the arbitrary-bandwidth flat peer-to-peer network. These variables will be used later to compare this 'baseline' performance with the performances of the other, more complicated architectures.

   Let's take a hypothetical peer-to-peer network in which most of the participating hosts are connected through the modems with the 5,000 bytes per second transfer limits, but 10 percent of the hosts are using the cable modems with the 12,000 bytes per second uplink (outgoing) bandwidth, and another 10 percent - ADSL links with the 20,000 bytes per second uplink bandwidth:

Percentage of hosts Outgoing bandwidth (bytes/sec)
80 5,000
10 12,000
10 20,000

Table 1. Sample host bandwidth distribution.

   For such a model network, the average bandwidth value will be:

bytes per second, (16)

and, taking the variable values from Section 2.5, the number of hosts and files reached by an average search request will be:

hosts (17)

and

files. (18)

   Here we are not trying to accurately approximate the hosts' bandwidth distribution for the Internet or some real peer-to-peer network. The goal here is to provide an example of a network with the different hosts present and to see how such a distribution affects the search performance, though the values used in this example loosely reflect the actual residential broadband adoption percentages in the United States.

   The equations (13)-(15) show that the search performance of the 'flat' peer-to-peer network is determined by the average host bandwidth, query message size, number of the data items (files) stored by the average host, and the interval between queries.

   All these components can be - and are - used to maximize the search performance. For example, all existing file-sharing networks try to provide the incentives for the users to share more files and to stay longer on-line, thus increasing the average number of stored files and the interval between requests (the latter effect is caused by the increased session length that allows the host to route and answer significantly more queries while initiating the same or just slightly increased number of searches itself).

   Similarly, the continued broadband adoption growth will improve the search quality in a network due to the gradual average bandwidth increase. For the purposes of this study, however, we will consider such improvements to be orthogonal to our main goal, which is to increase the request reach without changing the values of these parameters.

4. Superpeer Architectures.

   Now we can use the results obtained in the previous section to analyze the search performance of the peer-to-peer networks with different topologies. Let's start with the plain superpeer architecture.

4.1. Plain superpeer network.

   It has been noted long ago that the traffic in the Gnutella network tends to saturate the dial-up modem links ([21], [22]), thus limiting the search request reach in the network. So the natural countermeasure was to isolate the modem hosts from the query routing, selecting a subset of well-connected hosts called 'superpeers' and making them bear the routing load on behalf of their 'leaf nodes'. Such a superpeer network can be represented by a graph similar to Figure 1(a) in [9], or Figure 5 below:

Figure 5. Query delivery in a superpeer network.

   Figure 5 shows the query message delivery in a network with four superpeers a, b, c, and d. Here the search request is delivered by a leaf node to its superpeer a, which propagates it to the superpeers b, c, and d. Every superpeer has an index of all the content stored on its leaf nodes, so instead of delivering the query message to all the network nodes, it has to be delivered only to the superpeers, which can perform the local searches in their index databases without actually querying their leaf nodes.

   The actual mechanism of the message delivery to the superpeers is not important for us here - it just has to deliver the query message to other superpeers. It can be a Gnutella breadth-first traversal (BFS), random walkers [11], some modified form of the LimeWire GUESS proposal [12] that would allow a superpeer to query all other superpeers directly, or some other method used by the superpeers to communicate with each other.

   In any case, this query message delivery architecture seems to be much more effective than the one shown on Figure 1, since only the superpeers are involved in the message routing. Let's find the search performance (query reach) of such a network.

   First, let us note that the request traffic from the leaves to their 'home' superpeers is very low - the traffic from a single leaf node can be found as , which is about 0.5 bytes per second per leaf in assumption (1).

   Another component of the leaf-to-supernode traffic is the 'join' component, arising from a necessity to transfer the full leaf index to the supernode when the leaf host joins the network. This traffic can be calculated as , or about 5.5 bytes per second per leaf when amortized over the average session length . So if the number of supernode leaves is less than several hundred, their summary bandwidth usage is much lower than the total bandwidth of the well-connected supernode, and we can discount the supernode bandwidth consumed by the leaves.

   As far as the traffic analysis is concerned, this allows us to effectively erase the leaves from the Figure 5. After such an operation is performed, this chart becomes remarkably similar to the Figure 1. In fact, the only difference between the flat peer-to-peer network investigated in Section 3 and the flat peer-to-peer network consisting of only the superpeer hosts will be in the number of files locally searchable by the host , interval between queries , and i-th host bandwidth . Let's find these parameters for the superpeer network.

   Introducing the average number of superpeer leaves , it is easy to see that the number of files locally searchable on the superpeer - including its own files - will be:

files, (19)

and the queries will be generated by a superpeer for every query generated by its leaves and by a superpeer itself, so its average interval between queries can be found as:

(20)

   Substituting these values into the equations (13)-(15) that describe the flat peer-to-peer network performance, we can see that the query reach of the superpeer network will be:

superpeers, files (21)

reachable by a search, where

(22)

is an average superpeer outgoing bandwidth (this presumes that the outgoing bandwidth is not higher than the incoming bandwidth for all superpeers - if this is not the case, the analog of the equation (14) will have to be used.)

   It is also convenient to introduce an effective number of reachable hosts, defined as the number of the regular leaf hosts that would have to be searched to achieve the same number of files covered by the search:

(23)

- then from the equation (21) we can find the effective number of reachable hosts in a superpeer network as:

(24)

   Now if we define the ratio of the superpeer network search coverage to the regular, flat network search coverage as the search improvement coefficient , this improvement can be found from (13), (21), and (24) as:

, (25)

meaning that the search improvement is proportional to the ratio of the average superpeer bandwidth to the average peer-to-peer network host bandwidth.

4.2. Plain superpeer network: explanations.

   The result (25) is so unexpected and counterintuitive that it deserves a special explanation. The comparison of the Figure 1 and Figure 5 clearly suggests that even with the same bandwidth, the superpeer network search is inherently more effective than the search in the flat network. One might even expect the search efficiency and reach to be proportional to the number of superpeer leaves .

   Not quite so.

   The confusion - as it is often the case with peer-to-peer networks - is caused by a naive geometrical model of the search operation that is implicitly suggested by Figure 1 and Figure 5. We see the decreased number of arrows between blocks and assume that it directly translates into the traffic decline. Not really.

   What Figure 5 fails to convey is that as the number of superpeer leaves grows, so does the query traffic generated by this superpeer. And since the search query reach is ultimately limited by the connections' saturation (see Section 2.4), the better-than-average superpeer bandwidth is an only factor that allows the searches to cover more content items (files) in the plain superpeer network.

   Let's consider an example - a network with a host bandwidth distribution from Table 1. Its search performance in the 'flat network' case is defined by (17) and (18). If every broadband host will be selected as a superpeer, the average superpeer bandwidth will be 16,000 bytes/sec, and the effective number of searchable hosts and files will be 32,000 and 6.4*10^6 correspondingly. This gives us an improvement over the flat peer-to-peer network of:

, (26)

   If, however, we select only the better-connected 20,000-bytes per second hosts for the superpeer role, the effective number of searchable hosts will be 40,000, the number of searchable files - 8*10^6, and the improvement coefficient will be:

, (27)

   Now if we imagine that this network also contains enough hosts capable of supporting 100,000-bytes per second data streams to make the rest of the network hosts (broadband or not) their 'leaves', the search coverage will grow to 200,000 hosts, 4*10^7 files, and the improvement will be:

, (28)

   The key word in the last scenario is 'enough'. After all, the large network is virtually guaranteed to have some very well-connected hosts. The question is, would their number be sufficient to take all other hosts as leaves? Every leaf consumes about bytes of RAM on a superpeer, and typically requires a TCP connection to be kept open between the superpeer and the leaf (here we assume that all leaf indices have to be stored in RAM by a superpeer, since the disk access times might make it very difficult to process hundreds of queries per second otherwise).

   Assumption (1) implies about twenty kilobytes of RAM occupied on the superpeer by every leaf's storage, so having 500 leaves will use ten megabytes of the superpeer RAM, and five hundred connections. If there are enough network hosts (more than 0.2 percent) that can take this load and still deliver the 100,000-bytes per second data streams, the improvement (28) can be achieved. If not, less capable hosts will have to be used as the superpeers, lowering the average superpeer bandwidth and search request reach.

   These examples show that in the superpeer architecture it is very important to select the best and only the best hosts for the superpeer role, maximizing their number of leaves in the process. Even if some host has enough resources and is well-connected enough to be a superpeer, it should not become one unless some better-connected superpeers are overloaded or the bandwidth of this 'candidate' host is higher than the current average superpeer bandwidth. Doing otherwise will harm the search in the network.

   Such superpeer selection decisions, however, are very difficult to make on the local, 'node' level in the completely decentralized peer-to-peer networks. In the existing Gnutella algorithms, for example, every host decides for itself whether it can and will be a superpeer or not. To do so in an optimal fashion - or at least, without harming the existing network - it must have some global, network-wide information (current average superpeer bandwidth, etc), which is difficult to obtain in the ad hoc network. First, this information is typically not available to a network host. (In fact, currently it is not available to anyone but the network researchers.)

   And second, any attempt to collect and distribute this information in a decentralized fashion should be carefully designed to be protected against the malicious agents that might try to use these channels to mount an attack on the network. One example of such an attack would be to widely distribute the very low fake value for the current average superpeer bandwidth, which might cause many low-bandwidth hosts to become superpeers and significantly decrease the quality of the network search.

   This is why it is extremely tempting to employ the centralized methods of the superpeer selection. Some central location might have information about all the existing superpeers and carefully monitor their performance and load, selecting only the best-connected hosts for the superpeer role. In fact, there are some indications [13] that this is exactly what Kazaa is doing with its 'index servers'. With up to 500 leaf nodes per superpeer, the performance of the Kazaa network can be as high as it is allowed by the effective average bandwidth of the top 0.2 percent of the network hosts, which can be quite impressive. Due to the closed character of the Kazaa network, however, this hypothesis is hard to verify.

   And in any case, any centralization introduced into the peer-to-peer network is an additional liability - in both technical and legal sense. Which are remarkably close on the architectural level - court decisions, hacker attacks, and peak loads alike can harm any network functionality that is not thoroughly distributed over a super-redundant array of hosts.

4.3. Non-routing superpeers.

   An interesting variation of the plain superpeer architecture has been recently suggested by LimeWire in its GUESS proposal [12]. This proposal essentially suggests to remove the query routing functionality from the superpeers and make every leaf node query the superpeers directly, as shown on Figure 6 below:

Figure 6. Direct leaf-to-superpeers query delivery.

   The primary goal of this approach was to have a simple way to stop querying when a desired number of results is reached. This architecture, however, also has an interesting additional property - it does not use the outgoing (transmission) capacity of the superpeers. All query messages are transmitted by the leaves, and superpeers have to send only the answers. And the answers typically do not consume a lot of bandwidth in a file-sharing network, unless the number of average superpeer 'leaves' becomes so large that the network is effectively 'Napsterized' - and significant percentage of queries can be answered by any superpeer (see Appendix A). Such a case, however, requires some very powerful superpeers and is not considered here.

   So since the superpeers are not using their outgoing bandwidth, this bandwidth stops being a bottleneck, and the summary network traffic is becoming constrained by the incoming bandwidth of the superpeers. To be precise, it is also constrained by the summary outgoing bandwidth of the leaf nodes, but in a practical network design it makes sense to make the summary leaves' uplink capacity much higher than the summary superpeers' downlink one.

   The reason for this is that since every leaf node transmits only its own queries, the equality of the summary bandwidths of leaf uplinks and superpeer downlinks would mean that the leaf would spend, on the average, seconds transmitting the same query to the different superpeers. That, in turn, implies the search latency (why would the leaf host keep transmitting if it would already have the result?) of about , or 200 seconds.

   Such a search latency is excessive from the user interface standpoint, so it makes sense to organize a network in such a fashion that an average leaf would query all the superpeers that he can possibly query much faster than that - maybe even in just ten or twenty seconds. For the rest of the interval the leaf uplink would be idle, so it would be the summary superpeer downlink capacity that would determine the search performance of such a network.

   This condition can be written as:

, (29)

- so for the superpeer selection scenario which led to (28) (100 kilobytes per second average superpeer bandwidth), it is enough to have 14 leaves per superpeer to satisfy the 'superpeers are the bottleneck' condition, and 140 leaves per superpeer - to bring the typical search time to 20 seconds. Such leaf numbers can probably be handled by a typical superpeer without putting an excessive load on it.

   Once a condition (29) is met, the search performance of such a network starts to be defined by the equations (21), (24), and (25) - but the average superpeer bandwidth in this case is defined as:

(30)

instead of the equation (22). Now, since a typical residential broadband host has an incoming bandwidth that significantly exceeds its outgoing bandwidth, the performance of a superpeer network with this architecture can significantly exceed the performance of the superpeer networks that led us to (26) and (27) - even with exactly the same superpeer selection algorithms.

   If we roughly estimate the average superpeer incoming bandwidth to be from 500 kilobits to 1.5 megabits per second, we might expect the search performance increase coefficient to be between 7 and 20 over the 'flat' peer-to-peer network search performance. That improvement might be achieved even with the simplest 'local' superpeer selection algorithm that would allow every broadband host to become a superpeer provided that its number of leaves satisfies (29).

   Of course, such a performance increase presumes that the GUESS algorithm is flow-controlled or has some other provisions to gracefully limit the superpeer load coming from the multitude of hosts that are simultaneously bombarding the superpeer with requests. The early version of GUESS [12] quoted here does not address this issue yet.

4.4. Redundant superpeer clusters.

   Another approach to the search improvement in the superpeer network is to introduce the redundancy into the superpeer architecture.

   Many practical developers have been doing exactly that - opening the connections from the same leaf node to multiple superpeers - for a long time. For example, every Kazaa [2] leaf node seems to maintain the connections to three superpeer nodes [13]. However, typically the rationale for this behaviour was the need to keep the leaf node connected to the network in case the superpeer to which it is connected goes off-line.

   The study [9] pioneered another look at the superpeer redundancy. The authors introduced the 'k-redundancy' as a special variant of superpeer design and showed that this design can dramatically increase the search effectiveness. Here we'll analyze the k-redundancy from the search performance (query reach) perspective.

   The idea behind the k-redundancy is to group k superpeers together and upload the same leaf indices on all of them - Figure 7, for example, shows a 2-redundant superpeer network:

Figure 7. 2-redundant superpeer network.

   A leaf node sends a query to all the superpeers it's connected to, or to a random one. These superpeers or superpeer retransmit this query to other superpeers. The retransmission method details are not important here, as well as whether the original query is sent to one or to several superpeers, since we are mainly concerned by the limits imposed on the query propagation by the combined bandwidth and search capacity of all the other superpeers in the network. What is important is that we assume the absence of redundancy in the network (Section 2.1). In the Gnutella network case it means that every superpeer randomly chooses its connection points into the network, regardless of the choices made by other superpeers - including the superpeers in the same k-redundant cluster.

   So in the infinite network assumption (Section 2.2) the probability of the same query arriving from the different superpeers to the same superpeer or even into the same k-redundant superpeer cluster is negligibly low. That makes the query message delivery look like shown on Figure 8:

(a)

(b)

Figure 8. Query delivery in a 2-redundant superpeer network.

   Here every cluster (group of superpeers) acts as a 'composite superpeer'. Such a composite superpeer has k times more bandwidth than an average superpeer, and the query arriving to it causes a local database with files to be searched. is the number of leaf hosts of a supereer, so since the superpeers in the same k-redundant cluster are not directly connected, the query arriving to the cluster can access only the 'leaf files' and the files belonging to the superpeer that receives the query - the files that belong to all other superpeers in that cluster are not accessed.

   The average interval between queries generated by a k-redundant superpeer cluster is , so similarly to (21), the query reach in such an architecture is:

superpeer clusters, and files (31)

   Comparing this to the flat peer-to-peer network search performance (13), we see that the performance increase coefficient can be found as:

(32)

   Note that the superpeers in the same cluster on Figure 7 and Figure 8 are not connected directly. The reason for this is twofold: first, since most of the files are identical on all superpeers in the redundant cluster, connecting these superpeers will waste a lot of CPU and bandwidth resources. In fact, it might virtually erase all the performance benefits of the redundant superpeer architecture.

   And second, an explicit grouping of the superpeers and their leaves into clusters as shown on Figure 7 and Figure 8 is complicated and difficult to achieve and maintain. In practice, it is much simpler to just open k connections to random superpeers from every leaf node and send every query to these k superpeers.

   In this case the clusters are not even explicitly defined and layed out - however, this practical and easily implementable approach to superpeer redundancy is statistically (traffic- and search-wise) equivalent to the explicit k-redundant clustering. The query propagation in such a network is shown on Figure 9:

(a)

(b)

Figure 9. 2-redundant superpeer network with no explicit clustering.

   In order to understand why this 'implicit clustering' method is equivalent to the explicit one in Figure 7 and Figure 8, let's start with a plain superpeer network like the one analyzed in Section 4.1, but with leaf hosts connected to every superpeer. The average query reach (21) in such a network will be:

superpeers, files (33)

   Now let every leaf node establish another k-1 connections to the randomly selected superpeers and transfer its file index to them. This operation won't change the number of superpeers reached by the query (assuming that the superpeer bandwidth consumed by such an index transfer is small enough, as stated in Section 4.1), but will increase the number of files locally searchable on every superpeer to . So after that operation, the query reach (33) will be transformed into:

superpeers, files (34)

- which is exactly equivalent to the search performance of the explicitly clustered k-redundant superpeers (31).

   This analysis also explains why the performances of the searches in Figure 8 and Figure 9, (a) and (b) are the same. The performance gain is achieved not because the original query goes to more superpeers from the leaf node - if all leaf nodes use the same initial querying strategy, the flow control will limit the propagation of their queries to the same average number of superpeers regardless of the query message delivery mechanism. Instead, the performance gain is due to the fact that in the redundant clustering approach the information about the data items (files) stored on the leaf nodes is more widely disseminated throughout the network and becomes locally searchable on the additional superpeer hosts.

   So the choice between the initial query sending methods 8-9(a) and 8-9(b) should be determined by other considerations than the average query reach. The multiple-superpeer query launch case 8-9(b), for example, can have the lower search latency, since more queries are injected into the network simultaneously when the search is started. But on the other hand, it might make it more difficult to control the number of the returned results if the superpeers would be using the 'random walkers' [11] - since all the initial queries are independent, it is difficult to control the cumulative total number of results returned by these queries, especially if the redundancy level k is high.

   If, however, the network uses the plain Gnutella [1] brodacasts to deliver the query messages to the remote superpeers, it might make sense to control the number of returned results by sequentially (after a small timeout) launching multiple queries from the leaf node to the different superpeers. That approach will improve the overall network effectiveness at the cost of the increased search latency. As it was already said, though, such considerations are outside the scope of this study and are orthogonal to the optimization techniques discussed here. This whole subject is mentioned only to clarify the issue of the different initial query launch scenarios.

4.5. Optimal redundancy level.

   Now let's see what would be the optimal redundancy level k for the k-redundant superpeer clusters analyzed in Section 4.4. The study [9] limits its analysis with k=2, since the number of the network connections grows as k squared when k is increased. This limit, however, is largely an artificial one - after all, the number of connections grows only linearly on every network host, so the amount of connection-related host resources required for k-redundant clustering is linearly proportional to k.

   In fact, the Kazaa file-sharing network [2] seems to have every leaf node connected to three different superpeers [13], which is equivalent to 3-redundant clustering and is probably one of the reasons why its search performance is superior to the competing networks like Gnutella [1]. Also, it is easy to see that when k is much smaller than , the search query reach (31) and search performance increase coefficient (32) grow linearly with k. This kind of search performance improvement is not something to be discarded lightly.

   So here we'll try to find the redundancy level that would optimize the search performance without constraining ourselves with any artificial limits for k, but rather trying to establish what are the real redundancy growth limits in a peer-to-peer network.

   To begin the analysis, let's assume that the average superpeer bandwidth is not affected by the redundancy level k. This is not true, of course. The k-redundant superpeer network requires k times more hosts in the superpeer roles than the plain superpeer network analyzed in Section 4.1. So in reality increasing k requires less and less capable hosts to become superpeers, which inevitably lowers the average superpeer bandwidth. But for now, let's just assume that in our k-redundant network no special bandwidth-related superpeer selection is performed, so their average bandwidth is equal to the average network host bandwidth as defined in (14)-(16):

(35)

   Then all the search performance improvements due to the better-than average superpeer bandwidth will become a 'bonus'. Later on, we'll also show what will be the size of that 'bonus' for a network with a host bandwidth distribution specified in Table 1.

   In the assumption (35), the performance increase coefficient (32) becomes:

(36)

   For k>1 (that is, when the superpeer network is redundant to at least some extent) this function grows when k or are increased. So in principle we could achieve the infinite search coverage by simultaneously increasing the values of these arguments. 'Simultaneously' is important here - function (36) obviously cannot exceed . In practice, though, there are several reasons why this infinite coverage won't be possible:

   Since most of the hosts in a peer-to-peer network are likely to be the consumer-grade PCs, let's see what happens when the superpeer redundancy level is limited with:

k=100 (37)

   The plot of the search improvement coefficient function (36) versus the number of superpeer leaf nodes for this redundancy level is presented on the Figure 10 below. (As suggested by (35), the superpeer hosts are not selected for their superior bandwidth here and have the same average bandwidth as the network as a whole.)

Figure 10. Search improvement in a 100-redundant superpeer network.

   Here we see that when there are 500 leaf nodes for every superpeer, this redundant superpeer network can improve the effective search request reach by a factor of 84 over the flat peer-to-peer network analyzed in Section 3. Further increase of the number of superpeer leaves does not seem to be justified, since it won't dramatically improve the search performance anyway - with this redundancy level, the performance improvement coefficient cannot exceed one hundred regardless of value.

   Even though the required number of superpeer nodes grows when the redundancy level is increased, with k=100 and only about 1/6 of all network hosts have to become superpeers. This makes it possible to further improve the search performance by selecting the best-bandwidth hosts for the superpeer roles when is increased. The more leaf nodes the superpeer has, the less superpeers are required, which should increase the average superpeer bandwidth and search query reach as defined by the equation (32).

   Figure 11, for example, shows the plot of the search improvement coefficient versus the number of superpeer leaf nodes when the network has the host bandwidth distribution specified in Table 1:

Figure 11. Search improvement in a 100-redundant network with best-bandwidth superpeers.

   We see that when , it is possible to select only the broadband hosts for the superpeer roles, and the search improvement coefficient can be 195 vs 84 in Figure 10. At this point the average superpeer bandwidth can be as high as 16.8 kilobytes per second as opposed to 7.2 kilobytes per second for the network as a whole. Increasing the number of superpeer leaf nodes further, we can have only the best-bandwidth hosts (20 kilobytes/sec) in the superpeer roles, which could bring the search improvement coefficient to 250 if these best hosts would be capable of supporting 900 simultaneous TCP connections and would have about 18 megabytes of RAM to spare for the leaf nodes' indices.

   The last two assumptions are probably not very realistic, but even the 200-fold search improvement is very impressive. Comparing this with the flat network performance (17) and (18), we see that such a massively redundant superpeer network is capable of searching the content on

hosts, (38)

with a total of

files (39)

accessible for the search. These numbers are coming close to the ones necessary to perform a full search on the biggest of the modern file-sharing networks like Kazaa or Gnutella (though not quite achieving that goal - Kazaa is currently reporting up to 3M+ hosts on-line at the same time; Gnutella numbers are probably similar).

   It is also important to note that when these numbers for redundancy and leaf nodes per superpeer are used, the index transfer bandwidth limits mentioned above won't play a significant role yet. A 5,000-bytes per second modem host can fully transfer its 20-kilobyte index to 100 superpeers in about 400 seconds, which is about ten percent of its one-hour average session time. A 16,000-bytes per second superpeer would be able to receive 500 leaf indices in about ten minutes - but it probably won't have to. There is a high probability that the broadband superpeer will have an asymmentric link, whose incoming bandwidth will be much higher and mostly unused anyway.

   So the performance numbers (38) and (39) should not decrease by more than ten percent even when the index transfer bandwidth will be taken into account. More serious is a problem of the uniform superpeer load. If, say, an average superpeer is RAM-bound and has a limit of 500 leaf hosts, a special care should be taken to always keep its number of leaf nodes as close to this level as possible. Figure 11 shows that if, for example, the network connection maintenance algorithms cannot fully load an average superpeer, and it has only 250 leaf nodes, then the performance improvement coefficient will drop to only 125, which is much worse than the performance numbers above. Even worse, some modem-connected hosts will become superpeers, and the search performance will be further degraded, since their effective bandwidth will be lowered even more because of the bandwidth consumption by the index transfers, which might take as much as half of the average session time (2,000 seconds to transfer 500 leaf node indices). This problem is discussed in detail in the next section.

4.6. Superpeer network problems.

   Even though the superpeer architecture has a potential to deliver a very significant search perfomance increase, the efficient superpeer usage necessary for that increase might prove to be surprisingly hard to achieve. When the superpeer host goes on-line, someone has to direct the leaf hosts to connect to it as quickly as possible - otherwise it may take quite a while (possibly a full superpeer session duration) to load it to full capacity. In that case the average number of the superpeer leaf nodes might be only half of its peak capacity, causing the performance drop discussed at the end of the previous section.

   Similarly, the redundancy level optimization techniques discussed in Section 4.5 assumed that the network can be optimized from a central location that has a full knowledge of the network host bandwidth distribution, of the practically achievable redundancy level, and of the maximum number of superpeer leaf nodes for the existing network hosts.

   If this is not the case, and the network is fully decentralized, every host has to make its own decisions about becoming a superpeer or not, which can significantly affect the network performance. If, for example, too few hosts decide to become the superpeers, the leaf nodes might not have enough redundant connections, since the superpeers will be fully loaded with leaf indices and won't accept more leaf nodes. That will cause the search performance to drop almost proportionally to the redundancy level k decrease.

   If, on the contrary, there are too many superpeers, they won't be able to fully utilize their storage capacity, causing to drop and also decreasing the search performance (31), (32) in the process.

   It is not entirely clear how to achieve the full superpeer utilization without a centralized (or semi-centralized) connection director and superpeer role assigner, whose function might be performed by some network component similar to Kazaa index servers [13]. (It is not clear whether Kazaa index servers are really trying to keep the number of leafs on every superpeer host as close to its maximum capacity as possible. This functionality is not crucial for Kazaa, since it seems to have a relatively low redundancy level of 3, and the issue of maximizing might be not very important to it - in any case, secondary to the maximization of the average superpeer bandwidth discussed in Section 4.2).

   So the fully decentralized superpeer network is very likely to be in a less-than-optimal state most of the time. For example, the realistic average number of leaf nodes might be about 250, and the search improvement coefficient - about 125, as we can see from Figure 11. Or the average redundancy level k might be 50, which will give us about the same search improvement coefficient value of 125 - the performance drop is not exactly linear, since with the reduced k less superpeers are required and their average bandwidth increases.

   So the realistic numbers for the search query reach in a fully decentralized supepeer network might be significantly lower than the ones suggested by (38) and (39). For the search improvement coefficient value of 125, these numbers will be:

hosts, files (40)

- assuming that the decentralized superpeer role assignment algorithm will be effective enough to achieve either the optimal redundancy with a 50-percent superpeer load, or the full superpeer load with a 50 percent of the optimal redundancy level. If the decentralized superpeer assignment and/or the connection director algorithms would not be good enough to do that, the search performance will be degraded further. Little research was performed on the decentralized superpeer assignment and connection management algorithms, so it is difficult to predict the actually achievable numbers here.

   Finally, even these results depend on the fact that the superpeers have the asymmetric bandwidth with a high downlink capacity. If that wouldn't be the case, a significant share of the bandwidth normally used for the search would be used for the index transfers. With the average session time , a superpeer with the incoming bandwidth can receive bytes of data over its session lifetime. Since the connection between the superpeer and its leaf can be broken when either the superpeer session or the leaf node session ends, the average connection lifetime is two times smaller than the average session length . This means that with the average number of superpeer leaf nodes there would be, on the average, two index transfers for every superpeer link over the superpeer session time, and the number of index transfer bytes received by the superpeer during the session will be (a somewhat more detailed explanation of that average connection lifetime value is provided later, in Section 5.3 - see the equation (46)). Thus the effective superpeer search bandwidth can be found as:

(41)

- or, in the assumption (1), approximately 2,800 bytes per second less than the original superpeer bandwidth, if the average number of superpeer leaf nodes is 250. When the equation (41) is used to determine the average superpeer bandwidth for the search improvement coefficient function (32) instead of the original host bandwidths from Table 1, the chart on Figure 11 is transformed into Figure 12:

Figure 12. Search improvement in a 100-redundant network with symmetric superpeer links.

   This chart is interesting for two reasons - first, it shows that if the superpeer links are symmetric, the search performance is degraded further, bringing the search query reach close to just a million and a half hosts when the average number of superpeer leaf nodes is about 250 (half of the full superpeer capacity, assuming that an average superpeer has about 10 megabytes of RAM to spare for the storage of the leaf nodes' indices). And second, it shows that when the bandwidth used to transfer the leaf indices to the superpeers cannot be discounted as negligibly small, there is an optimal number of the superpeer leaf nodes (close to 500 in our model case), and further increase of their number actually harms the search performance.

   This effect is caused by the decrease of an effective average superpeer bandwidth in the equations (31) and (32) when the superpeer has too many leaf nodes and has to receive too much index data from them. Naturally, this is also the case in the asymmetric-bandwidth superpeer scenario - it is just that in our sample network defined by the assumption (1) and Table1, the incoming superpeer bandwidth was presumed to be sufficiently high for this optimum to be outside the plot area on Figure 11. The lack of smoothness of the curve on Figure 12 is caused by the discrete character of the network hosts' bandwidth distribution specified in Table 1. After 400 leaf nodes no modem hosts are used in the superpeer role anymore, and after 900 - only the best-bandwidth hosts are being used as the superpeers. The similar plot for the real peer-to-peer network should not have these clearly visible anomalies, although the general character of the plot should stay the same.

5. Mutual Index Caching.

   Another approach to the peer-to-peer search optimization is the mutual index caching. In this approach the peer-to-peer network hosts store not only the indices of their own data, but also the indices of the data located on other hosts. Doing so requires every peer-to-peer host to set aside some space for these indices and to somehow obtain them from the other network hosts, effectively becoming a superpeer.

   This approach has been repeatedly suggested before ([10], [11] and multiple suggestions about passive query reply monitoring on Gnutella Developer Forum [1], to name just a few). It is important that here we are talking about mutual caching of only the indices - not the content itself. Storing the content on behalf of other users can be very burdensome for the user and for the host hardware, whereas the requirements for the mutual index-only (metadata) caching are much lower.

   It seems to be clear that disseminating the content throughout the network should make it easier to find. Little quantitative analysis, however, has been performed in that area, and this study tries to correct this oversight. Our goal here is to suggest a caching strategy with the widest possible search query reach. This goal is not the only one imaginable - study [10], for example, did perform the quantitative analysis and simulation of the 'Local Indices' mutual index caching strategy, but in an assumption that the strategy goal is to minimize the system load (query cost) in terms of the processing power and bandwidth when the number of returned search results is fixed.

   This goal led the authors to recommend the caching depth of 1 for the Gnutella network with a query-to-join ratio of 10, which in our terms would correspond to the average session length of 2000 seconds instead of 3600 seconds in (1). This caching depth means that every Gnutella node should store the indices of its immediate neighbours. Study [10] also suggested that the caching depth should be increased if the nodes would spend more time on line, and their query-to-join ratio would be higher.

5.1. Mutual index caching performance.

   Let's have a look at the mutual index caching performance when the aggregate system load is of secondary importance, and the primary goal is to increase the query reach.

   Looking back at the analysis of the flat peer-to-peer network performance in Section 3, we can see that the mutual caching allows every host to have more files available for the search. Designating the number of hosts whose indices are made accessible on the average network node in addition to its own index as , we'll have the number of files searchable in such a network change from a value determined by (13) to:

files, (42)

which gives us a search performance improvement coefficient of

(43)

- so for the Gnutella network with four connections per host, the mutual caching with a depth of one will improve the effective query reach by a factor of up to five.

   It is not a coincidence that we're using the variable for the number of other hosts' indices stored on an average peer-to-peer network host. This variable was already used in Section 4 to designate the number of leaf nodes for the superpeer host, and is reused here because in both cases this variable defines how many 'foreign' indices are stored on the host. The limits and expected values for this variable are very similar, whether it is the number of superpeer leaf nodes or the number of mutually cached indices.

   Comparing the search improvement coefficient (43) with (32) and with the plot on Figure 11 that shows the best possible search performance (when the index transfer bandwidth is not a limiting factor) for the 100-redundant network defined by the assumption (1) and Table 1, we see that the flat network with the mutually cached indices has a potential to significantly outperform the k-redundant superpeer network. Figure 13 below shows the search improvement coefficients for 100-reduntant network (same as on Figure 11) and for the flat network with the mutually cached indices on the same chart:

Figure 13. Search improvement in a 100-redundant network and in a network with mutually cached indices.

   If every network node caches 500 indices from the other network nodes, the search improvement coefficient for the mutually cached indices network will be 501, and the potential (best achievable) search reach for our model network can be as high as:

hosts, files. (44)

- which is 2.5 times better than the 100-redundant superpeer network result (38), (39).

   This is hardly surprising, of course. The power of the peer-to-peer network is ultimately determined by its ability to harness the power of the individual network hosts, making every host contribute to the best of its abilities, however meager these abilities might seem at the first glance. Both mutual index caching and redundant superpeer clustering approaches achieve their search improvements by massively disseminating the data indices of network hosts over a large number of nodes - remember, that was exactly the reason why the k-redundant superpeer outperformed the plain superpeer network (see the analysis in Section 4.4).

   The difference between mutual caching and superpeering is that for a variety of reasons, the superpeer-based approaches try to fully utilize the local search and query routing abilities of only the best-connected network hosts, discarding the potential contribution of the less-capable ones.

   The summary contribution of these less-capable nodes, however, may be quite substantial, as it can be clearly seen on Figure 13. We can see that as the number of stored indices grows, the difference between the superpeer and mutual caching performance becomes more and more pronounced. The reason is quite simple - as the number of superpeer leaf nodes grows, less and less hosts are becoming the superpeers and performing the local database searching and query routing. Of course, at the same time the average superpeer bandwidth grows, but this effect cannot overweight the fact that the cumulative network potential becomes more and more underutilized.

   As a matter of fact, mathematically speaking, the equations (42) and (43) represent the best theoretical search performance of the peer-to-peer network with a non-selective query routing regardless of its architecture and topology. To put it simply, you cannot beat these results.

   But there are plenty of ways to never reach them.

5.2. Optimal search conditions.

   The optimal peer-to-peer search improvement results (42) and (43) can be achieved only if two conditions are met: first, that the mutually cached indices on the hosts accessible by a single search will not overlap, and second, that the bandwidth used to transfer the indices will not significantly decrease the bandwidth used for transferring the search queries. Otherwise, the search performance will degrade accordingly.

   The first condition is easy to meet - or at least, to make the index overlap negligibly small. One way of doing so would be to increase the number of connections (outdegree) of the average Gnutella node, which, by the way, would also fulfill the recommendation of study [9] that suggests that having the bigger average peer outdegree would reduce the average response path length, reducing the bandwidth consumed by the responses. Increasing outdegree will also reduce the average search latency.

   Here, however, we are mainly concerned with the fact that increasing the outdegree will tilt the histogram of responses vs hop number (distance from the query originator node) towards the last hop. If the average Gnutella host has four connections, the search query would be multiplied as 1-3-9-27-81-... during the search like shown on Figure 3 and we can expect about 2/3 of all the responses to arrive from the last-hop nodes, whereas 1/3 of the responses would arrive from the 'inner' nodes and would largely consist of the duplicate results. Further, even the last-hop responses would contain at least 1/4 of the duplicates - on Figure 3, for example, hosts a and e both receive the same indices to cache through the connections from the 'parent' host d.

   Increasing the number of host connections to ten will give us the query multiplication 1-9-81-..., and only 1/9 of the results will arrive from the 'inner' nodes. Similarly, only 1/10 of the last-hop host cached indices will be received from their 'parent' hosts, whereas 9/10 of the data will be unique. Of course, some index overlap will always exist. Also, increasing the average number of connections on the existing network might be inconvenient for a variety of reasons.

   So another approach to minimizing the mutually cached index overlap is to use separate and independent networks for the index transfers and for the search query delivery. For example, the same host might open two times more connections and use the new ones for index transfers only - no queries will go across them. These new connections are represented by the dotted lines on Figure 14 - for simplicity, it shows the index transfer connections and the index transfer network only for the host d.

Figure 14. Separate networks for querying and index transfers.

   When the host g issues the search query, it goes to d, which retransmits it to a and e only, but not to k and j, which are on the different logical network, as far as the host d is concerned. Similarly, the host d stores the data indices for k, j, l, m, and so on, received over the dotted-line connections, but not for the hosts g, a, and e. This approach allows to achieve the negligibly small overlap of the mutually cached indices when the network size is sufficiently large and the connection establishment is randomized (see the assumptions in Section 2.1 and Section 2.2).

   Other approaches to minimize the overlap are also possible - the study [10], for example, suggests using a 'Local Indices' algorithm that selectively searches only the hosts separated from the query originator by a predetermined number of hops. This algorithm, though, underutilizes the potential capabilities of the hosts when they only route the query message without doing a local search. It won't be analyzed here in detail, since it is probably not optimal because of this waste of the network resources.

5.3. Index transfer bandwidth.

   The issue of the bandwidth spent to transfer the mutually cached indices between hosts is more complicated. Of course, in the Gnutella example of a mutual caching with a depth of one from the previous section it is not an issue at all. Even a modem host with 5,000-bytes per second bandwidth can transfer four 20-kilobyte indices to and from its immediate neighbours in just 16 seconds, which is less than one percent of the total one-hour average session time. So the search performance degradation caused by this transfer will be negligible.

   But so will be the reward - five times better search query reach might sound impressive only until it is compared with the potential improvement from the other strategies - like massively redundant superpeers analyzed in Section 4.5.

   Obviously much better search performance can be achieved if the hosts cache much more data than the indices of their immediate neighbours. And as the hosts transfer more index data, the bandwidth consumed by these index transfers starts to play a more significant role in the overall system search performance. The resulting performance decrease will be similar to the gradual performance decrease shown on Figure 12 for the k-redundant superpeer network. If the average session length can be increased at will, it is easy to make the bandwidth consumed by the index transfers arbitrarily low. In this case the search performance can approach its theoretical peak defined by (42), (43) and shown on Figure 13.

   Of course, even with an infinite session length some bandwidth will be consumed by the index updates that have to be propagated through the index transfer network (dotted lines on Figure 14) when the local indices on the hosts are changed. This happens when the new content is downloaded on the hosts, when the unused files are deleted by the users, and so on. The study [9], however, suggests that this index update bandwidth is much lower than the index transfer bandwidth necessary for the initial index distribution when the host joins the network. In this study we won't be paying attention to the index update bandwidth, concentrating only on the initial index transfer one.

   So the simplest way to approach the peak search performance is to increase the average session length. Unfortunately, however, many factors that determine the average session length are out of the developer's control. As we'll see later, the session lengths that are typical for the peer-to-peer applications today (see Section 2.5 and assumption (1)) produce the index transfer bandwidth that can make the actual search performance significantly lower than its theoretical maximum. In the ad hoc networks it is a