This draft also defines a Diagnostics Usage, which can be used to obtain diagnostic information about a peer in the overlay. The Diagnostics Usage is of interest both to administrators monitoring the overlay and to overlay algorithms that base their decisions on the capabilities and current load of nodes in the overlay.
The Diagnostics Usage allows a node to report various statistics about itself that may be useful for diagnostics or performance management. It can be used to discover information such as the software version, uptime, routing table, stored resource-objects, and performance statistics of a peer. The usage defines several new kinds that can be retrieved to obtain these statistics, and it also allows retrieval of the other kinds that a node stores. In essence, the usage allows querying a node's state, such as its storage and network state, to obtain the relevant information. Additional diagnostic capabilities have been proposed elsewhere.

The access control model for all kinds is a local policy defined by the peer or by the overlay policy. The peer may be configured with a list of users for whom it is willing to return the information, and it restricts access to the users on that list. Unless specific policy overrides it, data SHOULD NOT be returned for users not on the list. Access control can also be determined on a per-kind basis: for example, a node may be willing to return the software version to any user while withholding specific information about performance.

TODO - need to explain how this is addressed to node-id.

[TODO: Do we need a DIAGNOSTIC method? Access control mechanisms for DIAGNOSTIC may be different from a Fetch.]

The following kinds are defined:

A single value element containing an unsigned 32-bit integer representing the number of peers in the peer's routing table.

A single value element containing a US-ASCII string that identifies the manufacturer, model, and version of the software.

A single value element containing an unsigned 64-bit integer specifying the time the node has been up, in seconds.

A single value element containing an unsigned 64-bit integer specifying the time the p2p application has been up, in seconds.

A single value element containing an unsigned 32-bit integer representing the memory footprint of the peer program in kilobytes.

[What's a kilobyte? 1000 or 1024? -- Cullen]

[Good question. 1000 seems like not quite enough room but 1024 is too much? -- EKR]

An unsigned 64-bit integer representing the number of bytes of data being stored by this node.

An array element containing the number of instances of each kind stored. The array is indexed by Kind-ID. Each entry is an unsigned 64-bit integer.

An array element containing the number of messages sent and received. The array is indexed by method code. Each entry in the array is a pair of unsigned 64-bit integers (packed end to end) representing sent and received.

A single value element containing an unsigned 32-bit integer representing an exponentially weighted average of bytes sent per second by this peer:

   sent = alpha x sent_present + (1 - alpha) x sent

where sent_present represents the bytes sent per second since the last calculation, and sent on the right-hand side represents the previous calculation of bytes sent per second. A suitable value for alpha is 0.8. This value is calculated every five seconds.

A single value element containing an unsigned 32-bit integer representing an exponentially weighted average of bytes received per second by this peer. The calculation is the same as for bytes sent.

[[TODO: We would like some sort of bandwidth measurement, but we're kind of unclear on the units and representation.]]
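As an illustration only, the exponentially weighted average described above might be computed as in the following Python sketch. The class name EwmaRate, the initial average of zero, the rounding to the nearest integer, and the clamping to the 32-bit range are assumptions made for this example; the text specifies only the formula, the suggested alpha of 0.8, and the five-second calculation interval.

```python
from dataclasses import dataclass

ALPHA = 0.8           # weight suggested in the text
INTERVAL_SECONDS = 5  # recalculation period given in the text
UINT32_MAX = 0xFFFFFFFF


@dataclass
class EwmaRate:
    """Tracks an exponentially weighted average of bytes per second."""
    alpha: float = ALPHA
    interval: float = INTERVAL_SECONDS
    average: float = 0.0  # assumption: the average starts at zero

    def update(self, bytes_in_interval: int) -> int:
        """Fold in the bytes counted since the last calculation.

        Returns the value as it would be reported in an unsigned
        32-bit element (rounded and clamped; both are assumptions).
        """
        present = bytes_in_interval / self.interval  # bytes/s for this interval
        self.average = self.alpha * present + (1 - self.alpha) * self.average
        return min(round(self.average), UINT32_MAX)


if __name__ == "__main__":
    sent = EwmaRate()
    # Feed three five-second byte counts and print the reported averages.
    for counted in (5000, 0, 2500):
        print(sent.update(counted))
```

With alpha at 0.8 the newest interval dominates, so the reported value follows changes in load within a few intervals while still smoothing out single-interval spikes.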
(OPEN QUESTION: any other metrics?)

Below, we sketch how these metrics can be used. A peer can use the EWMA_BYTES_SENT and EWMA_BYTES_RCVD values of another peer to infer whether that peer is acting as a media relay. It may then choose not to forward any requests for media relay to this peer. Similarly, among the various candidates for filling its routing table, a peer may prefer a candidate with a large UPTIME value, a small RTT, and a small LAST_CONTACT value.
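The two uses sketched above could look roughly like the following Python example. Everything beyond the metric names is hypothetical: the Candidate record, the RELAY_THRESHOLD cut-off, the rule that both directions must be busy, and the strict ordering of uptime before RTT before last contact are assumptions chosen for illustration, since the text states the preferences without ranking them.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical record of what a peer knows about another peer."""
    node_id: str
    uptime: int            # seconds, from the peer's reported uptime
    rtt: float             # seconds, measured locally (assumption)
    last_contact: float    # seconds since last contact (assumption)
    ewma_bytes_sent: int = 0
    ewma_bytes_rcvd: int = 0


RELAY_THRESHOLD = 50_000  # bytes/s; hypothetical cut-off, not from the draft


def looks_like_media_relay(c: Candidate) -> bool:
    """Guess that a peer is relaying media when both directions are busy."""
    return (c.ewma_bytes_sent >= RELAY_THRESHOLD
            and c.ewma_bytes_rcvd >= RELAY_THRESHOLD)


def rank_candidates(candidates):
    """Order candidates: largest uptime, then smallest RTT, then smallest last contact."""
    return sorted(candidates, key=lambda c: (-c.uptime, c.rtt, c.last_contact))


if __name__ == "__main__":
    peers = [
        Candidate("a", uptime=100, rtt=0.05, last_contact=2.0),
        Candidate("b", uptime=500, rtt=0.20, last_contact=1.0),
        Candidate("c", uptime=500, rtt=0.10, last_contact=3.0,
                  ewma_bytes_sent=80_000, ewma_bytes_rcvd=90_000),
    ]
    for p in rank_candidates(peers):
        print(p.node_id, "busy relay?", looks_like_media_relay(p))
```

A weighted score would be an equally reasonable way to combine the three values; the lexicographic sort is used here only because it needs no tuning constants.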