Difference between revisions of "LOCKSS: Polling and Repair Protocol"

From CLOCKSS Trusted Digital Repository Documents

Revision as of 23:32, 26 September 2013

LOCKSS: Polling and Repair Protocol

Overview

As described in CLOCKSS: Box Operations, the CLOCKSS boxes are configured to form a Private LOCKSS Network (PLN). The boxes run the LOCKSS polling and repair protocol as described in our ACM Transactions on Computer Systems paper, with the following modifications:

  • Because the CLOCKSS PLN is a closed network secured by SSL certificate checks at both ends of all connections, the defenses against Sybil attacks, in which the adversary creates new peer identities, are not necessary and are not implemented.
  • The efficiency enhancements described below are being deployed to the CLOCKSS PLN.

The LOCKSS polling and repair protocol performs regular integrity checks on each AU at each CLOCKSS box (the poller) by:

  • Selecting a random sample of the other CLOCKSS boxes (the voters).
  • Inviting the voters to participate in a poll on the AU.
  • Having the voters vote, using a procedure based on nonced cryptographic hashes, on the content of each URL in their copy of the AU.
  • Tallying the votes, during which the poller may detect that:
    • A URL it has does not match the consensus of the voters, or
    • A URL that the consensus of the voters says should be present in the AU is missing from the poller's AU, or
    • A URL it has does not match the checksum generated when it was stored.
  • Repairing any such problem by:
    • requesting a new copy from one of the voters that agreed with the consensus,
    • then verifying that the new copy does agree with the consensus.
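
The steps above can be illustrated with a minimal Python sketch. It is not the LOCKSS implementation: the function names, the use of SHA-256, the 16-byte nonce, and the simple-majority rule for "consensus" are all assumptions made for illustration, and the sketch omits the local stored-checksum check as well as the other defenses described in the paper.

```python
import hashlib
import os
import random
from collections import Counter


def vote(au, nonce):
    """Voter side: hash the poll's nonce together with the content of each URL.

    `au` is a hypothetical in-memory AU: a dict mapping URL -> bytes.
    """
    return {url: hashlib.sha256(nonce + content).hexdigest()
            for url, content in au.items()}


def poll_and_repair(poller_au, peers, sample_size, rng=random):
    """Run one illustrative poll on `poller_au` and repair it in place.

    `peers` maps a box name to that box's copy of the AU.
    Returns the sorted list of URLs that were repaired.
    """
    nonce = os.urandom(16)                        # fresh nonce for this poll
    voters = rng.sample(sorted(peers), sample_size)
    if not voters:
        return []
    votes = {name: vote(peers[name], nonce) for name in voters}
    own = vote(poller_au, nonce)

    repaired = []
    for url in sorted(set(own).union(*votes.values())):
        # Tally the voters' hashes for this URL (None means "URL absent").
        tally = Counter(v.get(url) for v in votes.values())
        consensus, count = tally.most_common(1)[0]
        # Assumption: "consensus" means a strict majority of the voters.
        if consensus is None or 2 * count <= len(voters):
            continue
        if own.get(url) == consensus:
            continue                              # poller agrees, nothing to do
        # Request a new copy from a voter that agreed with the consensus...
        source = next(n for n in voters if votes[n].get(url) == consensus)
        candidate = peers[source][url]
        # ...then verify that the new copy does agree with the consensus.
        if hashlib.sha256(nonce + candidate).hexdigest() == consensus:
            poller_au[url] = candidate
            repaired.append(url)
    return repaired
```

Because the nonce is generated fresh for each poll, a voter in this sketch cannot answer from a precomputed hash; it has to hash the content it actually holds.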

Enhancements

The LOCKSS team's internal monitoring and evaluation processes identified some areas in which the efficiency of the polling process could be improved in the context of the Global LOCKSS Network (GLN). The Andrew W. Mellon Foundation funded work to implement and evaluate improvements in these areas. This work is expected to be complete by March 2014. Although these improvements will be deployed to the CLOCKSS network, the areas of inefficiency are less relevant there because the CLOCKSS network has far fewer boxes than the GLN. The improvements are therefore not expected to make a substantial difference to the performance of the CLOCKSS network.

The Mellon-funded work included development of improved instrumentation and analysis software, which polls the administrative Web UI of each LOCKSS box in a network to collect large amounts of data about the operations of each box. These tools were used on the CLOCKSS network for an initial 59-day period, collecting over 18 million data items. The data collected has yet to be fully analyzed, but initial analysis shows that the polling process among CLOCKSS boxes continues to operate satisfactorily. Some examples of the graphs generated follow.

Sample Graph 1.png
This graph shows the number of AU instances in CLOCKSS boxes which have reached agreement with N other CLOCKSS boxes, showing the progress AUs make after ingest as the LOCKSS polling and repair protocol identifies matching AU instances at other boxes. The graph shows that few AU instances in the sample have reached agreement with only a few boxes, and that the majority of AU instances have reached agreement with AU instances at the majority of other CLOCKSS boxes.

Sample Graph 2.png
This graph shows the extent of agreement among the more than 40,000 completed polls in the sample. As can be seen, the overwhelming majority of the polls showed complete agreement. Polls with less than complete agreement are likely to have been caused by polling among AU instances that were still collecting content, and so held different subsets of the URLs in an AU.
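
As a rough illustration of the quantities plotted in the two graphs, the following Python sketch computes them from hypothetical records. The input shapes and function names are assumptions; they do not reflect the format of the data actually collected by the instrumentation software.

```python
from collections import Counter


def agreement_histogram(agreeing_boxes):
    """Count AU instances by the number N of other boxes they agree with.

    `agreeing_boxes` is a hypothetical mapping from (box, au) to the set of
    other boxes with which that AU instance has reached agreement.
    """
    counts = Counter(len(boxes) for boxes in agreeing_boxes.values())
    return dict(sorted(counts.items()))


def fraction_complete(poll_agreements):
    """Fraction of completed polls showing complete (1.0) agreement.

    `poll_agreements` is a hypothetical list of per-poll agreement levels,
    each a float between 0.0 and 1.0.
    """
    if not poll_agreements:
        return 0.0
    complete = sum(1 for a in poll_agreements if a == 1.0)
    return complete / len(poll_agreements)
```

The first function corresponds to the distribution shown in Sample Graph 1, and the second summarizes the kind of per-poll agreement shown in Sample Graph 2.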

Change Process

Changes to this document require:

  • Review by LOCKSS Engineering Staff
  • Approval by LOCKSS Chief Scientist

Relevant Documents

  1. CLOCKSS: Box Operations
  2. Petros Maniatis, Mema Roussopoulos, TJ Giuli, David S.H. Rosenthal, Mary Baker, and Yanto Muliadi. “LOCKSS: A Peer-to-Peer Digital Preservation System”, ACM Transactions on Computer Systems vol. 23, no. 1, February 2005, pp. 2-50. http://dx.doi.org/10.1145/1047915.1047917 accessed 2013.8.7