Difference between revisions of "CLOCKSS: Logging and Records"

From CLOCKSS Trusted Digital Repository Documents
Latest revision as of 22:04, 14 August 2019

CLOCKSS: Logging and Records

The CLOCKSS system uses three types of record:

  • Logs: detailed logs of the operations of the LOCKSS daemon, at an extensively customizable level of detail, written to log files on the host machine. The purpose of these logs is to enable diagnosis of problems that arise. Logs are retained on the machine that generated them in /var/log.
  • Alerts: messages sent off-machine by the LOCKSS daemon when significant events occur. The purpose of Alerts is to draw attention to potential problems that may need diagnosis. Alerts are sent via e-mail to the <clockss-alerts> mail alias, and added to the log files on the host machine via the syslog mechanism.
  • Records: statistical summaries and business records of the operation of the system as a whole, not of individual boxes. They are provided to the CLOCKSS board and CLOCKSS member organizations, and electronic copies are stored in a system run by the Executive Director.

Retention Policy

Although the LOCKSS daemon can generate extremely detailed logs, doing so routinely is counter-productive. It buries the signal in the noise. The goal of the logging and record policy, in the absence of a specific problem to diagnose, is to:

  • Generate Logs adequate to, and retain them long enough to, enable simple diagnosis.
  • Generate Alerts on any condition that the daemon determines is anomalous, and on other significant events, with sufficient detail to draw the system administrator's attention to problems requiring diagnosis, and to retain them indefinitely.
  • Generate the Records needed for business and governance, and for monitoring of the CLOCKSS network's overall performance, and to retain them indefinitely.

Specific log retention policies for each CLOCKSS box are specified in /etc/logrotate.conf and the files in /etc/logrotate.d/. On each CLOCKSS Box:

  • System logs are retained for a month.
  • At least the most recent 20MB of LOCKSS daemon log data is retained.
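
The size-based rule for daemon logs can be illustrated with a minimal sketch. It is not the logrotate configuration used on the boxes; the function name and the file names in the usage note are hypothetical, and the only figure taken from the policy is the 20MB minimum.

```python
from typing import List, Tuple

# "At least the most recent 20MB" from the retention policy above.
MIN_RETAINED_BYTES = 20 * 1024 * 1024


def files_to_keep(files_newest_first: List[Tuple[str, int]],
                  min_bytes: int = MIN_RETAINED_BYTES) -> List[str]:
    """Return the newest log files whose combined size first reaches min_bytes.

    files_newest_first is a list of (file name, size in bytes), newest first.
    Anything older than the returned files could be discarded without
    dropping below the minimum.
    """
    kept: List[str] = []
    total = 0
    for name, size in files_newest_first:
        if total >= min_bytes:
            break  # the files already kept cover the minimum
        kept.append(name)
        total += size
    return kept
```

For example, with four rotated files of 8MB each (hypothetical names such as daemon.log, daemon.log.1, and so on), the three newest are kept, because two files total only 16MB.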

Ingest Alerts

An Alert is generated at the end of each crawl of a SIP that meets certain criteria, recording the final status of the crawl and the number of HTTP 200 results obtained (equivalent to the number of new URLs found plus the number of existing URLs found to have modified content). An example of such an alert:

  Date: Sat 19 Feb 2011 04:17:24 PST
  From: LOCKSS box ingest2.clockss.org <clockss-alert@xxx.xxx>
  Subject: [lockss-alert] LOCKSS box info: CrawlEnd

  LOCKSS box 'ingest2.clockss.org' raised an alert at Sat Feb 19 04:12:24 PST 2011

  Name: CrawlEnd
  Severity: info
  AU: Nature Reviews Genetics Volume 11
  Explanation: Crawl ended successfully: 2276 new files

  Crawl ended successfully, 2276 new files, 4 warnings.

Here is an example failed crawl alert from an ingest box:

From: LOCKSS box ingest1.clockss.org <clockss-alert@xxx.xxx>
To: clockss-alert@xxx.xxx
Date: Thu 20 Mar 2014 21:50:18 PDT
Subject: [clockss-alert] LOCKSS box warning: CrawlFailed

LOCKSS box 'ingest1.clockss.org' raised an alert at Thu Mar 20 21:45:18 PDT 2014

Name: CrawlFailed
Severity: warning
AU: Journal of Pharmacology and Experimental Therapeutics Volume 346
Explanation: Crawl finished with error: Can't fetch permission page: 0 files fetched, 0 warnings, 1 error
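
Because the alert bodies shown above use a regular "Field: value" layout, they are straightforward to process automatically. The following is a minimal sketch of such a parser, assuming only the field names visible in the examples in this document; the function name is hypothetical and this is not part of the LOCKSS daemon.

```python
import re
from typing import Dict

# Field names taken from the example alerts in this document.
FIELD_RE = re.compile(r"^\s*(Name|Severity|AU|AUID|Explanation):\s*(.*)$")


def parse_alert_fields(body: str) -> Dict[str, str]:
    """Extract the first occurrence of each known field from an alert body."""
    fields: Dict[str, str] = {}
    for line in body.splitlines():
        match = FIELD_RE.match(line)
        if match and match.group(1) not in fields:
            fields[match.group(1)] = match.group(2).strip()
    return fields


SAMPLE = """\
LOCKSS box 'ingest1.clockss.org' raised an alert at Thu Mar 20 21:45:18 PDT 2014

Name: CrawlFailed
Severity: warning
AU: Journal of Pharmacology and Experimental Therapeutics Volume 346
Explanation: Crawl finished with error: Can't fetch permission page: 0 files fetched, 0 warnings, 1 error
"""

if __name__ == "__main__":
    for key, value in parse_alert_fields(SAMPLE).items():
        print(f"{key}: {value}")
```

A parser along these lines would let a mail filter separate warning alerts such as CrawlFailed from routine info alerts such as CrawlEnd.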

Preservation Alerts

An Alert is generated at the end of each poll that detects an integrity problem:

  • If there were a non-zero number of URLs for which:
    • A repair was needed because the content failed to match the consensus.
    • Repair content was fetched.
    • The repair content failed to match the consensus.
  • If there were a non-zero number of URL versions newly flagged as suspect because their content failed to match the locally stored hash.
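
The two conditions above can be expressed as a short sketch. The data structure and names are hypothetical and chosen for illustration; the daemon's actual internal representation of poll results is not described in this document.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class UrlPollResult:
    """Hypothetical per-URL outcome of a poll."""
    url: str
    repair_needed: bool             # content failed to match the consensus
    repair_fetched: bool            # repair content was fetched
    repair_matched_consensus: bool  # repaired content now matches


def should_raise_preservation_alert(results: Iterable[UrlPollResult],
                                    newly_suspect_versions: int) -> bool:
    """Return True if the poll detected an integrity problem."""
    unresolved = [
        r for r in results
        if r.repair_needed and r.repair_fetched
        and not r.repair_matched_consensus
    ]
    # Alert if any repair failed to resolve the disagreement, or if any URL
    # version was newly flagged as suspect against its locally stored hash.
    return bool(unresolved) or newly_suspect_versions > 0
```

Under this sketch, a poll in which every fetched repair matches the consensus and no versions are newly suspect raises no alert.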

Here is an example alert produced during testing in the STF test environment, in which a repair that failed to match the consensus was injected:

From: LOCKSS box quark <xxx@xxx.xxx>
To: clockss-alert@xxx.xxx
Date: Thu 20 Mar 2014 22:50:33 PDT
Subject: LOCKSS box warning: PersistentDisagreement

LOCKSS box 'quark' raised an alert at Thu Mar 20 22:50:33 PDT 2014

Name: PersistentDisagreement
Severity: warning
AU: Simulated Content: simContent
AUID: org|lockss|plugin|simulated|SimulatedPlugin&root~simContent
Explanation: Poll did not achieve consensus on all files

21 URLs tallied, 95.23% agreement
1 repair received, 0 not received.

1 repair didn't resolve disagreement:
http://www.example.com/003file.txt

Dissemination Alerts

The CLOCKSS archive is a dark archive; access to the content is permitted only at the direction of the CLOCKSS board. Thus, as described in CLOCKSS: Box Operations, the content access mechanisms of the LOCKSS daemon are disabled, and packet filters are used to further prevent access. Nevertheless, Alerts are generated on any access to the content in order that they may be treated as Security Alerts.

Here is a sample access alert from an ingest box. These accesses are expected as they come from production boxes crawling the ingest box; the alerts were turned on briefly as a test but would normally be disabled.

From: LOCKSS box ingest1.clockss.org <clockss-alert@xxx.xxx>
To: clockss-alert@xxx.xxx
Date: Sun 05 Jan 2014 08:42:42 PST
Subject: LOCKSS box info: ContentAccess (multiple)

LOCKSS box 'ingest1.clockss.org' raised an alert at Sun Jan 05 08:12:31 PST 2014

Name: ContentAccess
Severity: info
Explanation: Proxy access: http://www.nature.com/rj/style/group.css : 200 from cache in 398ms

==========================================================================
LOCKSS box 'ingest1.clockss.org' raised an alert at Sun Jan 05 08:12:33 PST 2014

Name: ContentAccess
Severity: info
Explanation: Proxy access: http://www.nature.com/nbt/journal/v29/n3/abs/nbt.1829.html : 200 from cache in 243ms

==========================================================================
LOCKSS box 'ingest1.clockss.org' raised an alert at Sun Jan 05 08:12:33 PST 2014

Name: ContentAccess
Severity: info
Explanation: Proxy access: http://www.nature.com/nbt/journal/v29/n3/abs/nbt.1829.html : 200 from cache in 1ms

==========================================================================
LOCKSS box 'ingest1.clockss.org' raised an alert at Sun Jan 05 08:12:33 PST 2014

Name: ContentAccess
Severity: info
Explanation: Proxy access: http://www.nature.com/nbt/journal/v29/n3/abs/nbt.1829.html : 200 from cache in 1ms

==========================================================================
LOCKSS box 'ingest1.clockss.org' raised an alert at Sun Jan 05 08:12:33 PST 2014

Name: ContentAccess
Severity: info
Explanation: Proxy access: http://www.nature.com/nbt/journal/v29/n3/abs/nbt.1829.html : 200 from cache in 0ms

==========================================================================
LOCKSS box 'ingest1.clockss.org' raised an alert at Sun Jan 05 08:12:34 PST 2014

Name: ContentAccess
Severity: info
Explanation: Proxy access: http://www.nature.com/nbt/journal/v29/n9/covers/index.html : 200 from cache in 157ms

==========================================================================
LOCKSS box 'ingest1.clockss.org' raised an alert at Sun Jan 05 08:12:34 PST 2014

Name: ContentAccess
Severity: info
Explanation: Proxy access: http://www.nature.com/nbt/journal/v29/n9/covers/index.html : 200 from cache in 0ms

==========================================================================
LOCKSS box 'ingest1.clockss.org' raised an alert at Sun Jan 05 08:12:35 PST 2014

Name: ContentAccess
Severity: info
Explanation: Proxy access: http://www.nature.com/nbt/journal/v29/n9/covers/index.html : 200 from cache in 0ms

==========================================================================
LOCKSS box 'ingest1.clockss.org' raised an alert at Sun Jan 05 08:12:36 PST 2014

Name: ContentAccess
Severity: info
Explanation: Proxy access: http://www.nature.com/nbt/journal/v29/n9/covers/index.html : 200 from cache in 0ms
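
A "(multiple)" alert such as the one above bundles several alerts separated by lines of "=" characters. The following sketch, which assumes exactly the layout shown in the sample and uses hypothetical names, splits such a digest and counts the accesses per URL.

```python
import re
from collections import Counter

# A separator is a line consisting only of "=" characters.
SEPARATOR_RE = re.compile(r"^=+[ \t]*$", re.MULTILINE)
# The accessed URL follows "Proxy access:" in the Explanation field.
ACCESS_RE = re.compile(r"^Explanation: Proxy access: (\S+) :", re.MULTILINE)


def count_accesses(digest: str) -> Counter:
    """Count ContentAccess alerts per URL in a multi-alert message body."""
    counts: Counter = Counter()
    for section in SEPARATOR_RE.split(digest):
        match = ACCESS_RE.search(section)
        if match:
            counts[match.group(1)] += 1
    return counts


SAMPLE_DIGEST = """\
Name: ContentAccess
Severity: info
Explanation: Proxy access: http://www.nature.com/rj/style/group.css : 200 from cache in 398ms

==========================================================================
Name: ContentAccess
Severity: info
Explanation: Proxy access: http://www.nature.com/nbt/journal/v29/n3/abs/nbt.1829.html : 200 from cache in 243ms

==========================================================================
Name: ContentAccess
Severity: info
Explanation: Proxy access: http://www.nature.com/nbt/journal/v29/n3/abs/nbt.1829.html : 200 from cache in 1ms
"""

if __name__ == "__main__":
    for url, n in count_accesses(SAMPLE_DIGEST).most_common():
        print(n, url)
```

Summarizing by URL makes it easier to see whether the accesses match an expected pattern, such as a production box crawling an ingest box, or need to be treated as a Security Alert.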

Administrative and Security Alerts

Alerts are generated on the following administrative actions:

  • Changes to the configuration files.
  • Changes to the access control permissions.
  • Adding or de-activating an AU.
  • Enabling or disabling the content servers.
  • User account added or removed or password changed.

External Communications

Engagement

Engagement with harvest content publishers before ingestion is described in CLOCKSS: Ingest Pipeline.

Engagement with file transfer content publishers before ingestion is described in CLOCKSS: Ingest Pipeline.

In all cases interactions with the publisher take place through the RT ticketing system, so they are recorded permanently.

External Reports

The technology for generating reports is being revised; the current technology is becoming too inefficient as the number of articles on each box grows because it generates reports on each box from the LOCKSS: Metadata Database then merges them. The new technology is a composite database with data synchronized from one or more preservation boxes' metadata databases. In addition to tracking individual article metadata, the consolidated database will track a per-machine ingest timestamp of the article on that box and will support de-duplication based on explicitly-defined rules.
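
The idea of a composite database holding a per-machine ingest timestamp for each article can be sketched as follows. This is an illustration only: the table layout, column names, and function names are assumptions, SQLite stands in for whatever database is actually used, and the de-duplication rule shown (one row per article and box, with the latest synchronized value winning) is a hypothetical example of an explicitly defined rule.

```python
import sqlite3
from typing import Iterable, List, Tuple


def open_composite_db() -> sqlite3.Connection:
    """Create an in-memory composite database (hypothetical schema)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        """CREATE TABLE article_ingest (
               article_id TEXT NOT NULL,
               box        TEXT NOT NULL,
               ingest_ts  INTEGER NOT NULL,
               PRIMARY KEY (article_id, box)
           )"""
    )
    return db


def sync_box(db: sqlite3.Connection, box: str,
             rows: Iterable[Tuple[str, int]]) -> None:
    """Merge (article_id, ingest_ts) rows from one box's metadata database.

    The primary key acts as the de-duplication rule: synchronizing the same
    article from the same box again replaces the row rather than adding one.
    """
    db.executemany(
        "INSERT OR REPLACE INTO article_ingest (article_id, box, ingest_ts) "
        "VALUES (?, ?, ?)",
        [(article_id, box, ts) for article_id, ts in rows],
    )
    db.commit()


def boxes_holding(db: sqlite3.Connection,
                  article_id: str) -> List[Tuple[str, int]]:
    """Return (box, ingest_ts) pairs for one article, ordered by box name."""
    cur = db.execute(
        "SELECT box, ingest_ts FROM article_ingest "
        "WHERE article_id = ? ORDER BY box",
        (article_id,),
    )
    return cur.fetchall()
```

With this layout, a report on how many boxes hold each article becomes a single query against one database rather than a merge of separately generated per-box reports.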

The following reports are generated for external consumption:

  • Monthly reports of the state of preservation of all serials committed to preservation in the CLOCKSS archive are delivered to the CLOCKSS board and the Keepers Registry, and posted on the Web.
  • KBART reports are generated monthly and posted on the Web. For the Global LOCKSS Network, these reports are used to update link resolver knowledge bases so that libraries can provide their readers access to the content of their LOCKSS box. Because the CLOCKSS archive is a dark archive, these reports cannot be used to update link resolvers. However, several analysis tools use KBART as an input format, so the KBART reports for CLOCKSS are made public.
  • The CLOCKSS Executive Director is sent an e-mail report of the article counts in the CLOCKSS archive weekly. These reports are preserved in Stanford's backup system.
  • The CLOCKSS Archive charges publishers a small fee for each current article ingested, billed quarterly. Thus a semi-annual report is generated showing for each publisher the number of their articles ingested in that quarter for each publication year. The report is submitted to the CLOCKSS Executive Director for onward transmission to the publishers. Significant discrepancies between this and the publisher's own article counts will result (and have resulted) in investigation and corrective action. To aid in this process more detailed reports, down to the article level, can be generated on request.

The CLOCKSS Metadata Lead is responsible for the production and dissemination of these reports.
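
The per-publisher billing report described above amounts to counting ingested articles by publisher and publication year, then comparing the totals with the publisher's own figures. A minimal sketch, with hypothetical names and an arbitrary 5% threshold standing in for whatever is considered a "significant" discrepancy:

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def articles_per_publisher_year(
        ingested: Iterable[Tuple[str, int]]) -> Dict[Tuple[str, int], int]:
    """Count ingested articles by (publisher, publication year)."""
    counts: Counter = Counter()
    for publisher, publication_year in ingested:
        counts[(publisher, publication_year)] += 1
    return dict(counts)


def is_significant_discrepancy(our_count: int, publisher_count: int,
                               threshold: float = 0.05) -> bool:
    """Flag a difference larger than threshold, relative to the larger count.

    The 5% default is an assumption for illustration, not a CLOCKSS policy.
    """
    largest = max(our_count, publisher_count)
    if largest == 0:
        return False
    return abs(our_count - publisher_count) / largest > threshold
```

For instance, 95 articles counted against a publisher's figure of 100 differs by exactly 5% and would not be flagged under this assumed threshold, while 90 against 100 would be.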

Monitoring

Log Monitoring

  • Ingest boxes: The CLOCKSS Technical Lead is responsible for monitoring logs on the ingest boxes.
  • Production boxes: The CLOCKSS Technical Lead is responsible for monitoring logs on production boxes when needed.
  • Web servers: The CLOCKSS Network Administrator is responsible for monitoring web server logs.

Alert Monitoring

The CLOCKSS Technical Lead is responsible for monitoring the Alerts generated by CLOCKSS boxes.

Nagios

The state of the CLOCKSS infrastructure, including the CLOCKSS boxes and the ingest machines, is monitored by Nagios as described in CLOCKSS: Box Operations.

The CLOCKSS Network Administrator is responsible for monitoring via Nagios.

Network Diagnostics

The LOCKSS team's internal monitoring and evaluation processes identified some areas in which the efficiency of the polling process could be improved in the context of the Global LOCKSS Network (GLN). The Andrew W. Mellon Foundation funded work to implement and evaluate improvements in these areas. This work is expected to be complete by March 2015. Although these improvements will be deployed to the CLOCKSS network, the areas of inefficiency are not relevant to it, because the CLOCKSS network has far fewer boxes than the GLN. Thus the improvements are not expected to make a substantial difference to the performance of the CLOCKSS network.

The Mellon-funded work included development of improved instrumentation and analysis software, which polls the administrative Web UI of each LOCKSS box in a network to collect vast amounts of data about the operations of each box. For examples of the use of this software, see LOCKSS: Polling and Repair Protocol.

The CLOCKSS Network Administrator is responsible for collecting and analyzing this data.

Change Process

Changes to this document require:

  • Review by:
    • LOCKSS Engineering Staff
    • CLOCKSS Technical Lead
  • Approval by CLOCKSS Network Administrator

Relevant Documents

  1. CLOCKSS: Box Operations
  2. CLOCKSS: Ingest Pipeline
  3. LOCKSS: Polling and Repair Protocol
  4. Definition of AIP