Getting statistics on Usage
By running commands on the Rails console or Rake tasks on the server, you can fetch statistics about usage. These commands are intended for developers and the Repository Manager, not for general users. Note that in the descriptions below, "project" == "BatchContext" in our code. When you fill in the form in pre-assembly, you create a new "BatchContext", which represents an accessioning project. A single project can then spawn both discovery reports and pre-assembly jobs using those parameters (sometimes multiple jobs if there are errors the first time), so a single project produces one or more jobs.
The following commands must be run on the Rails console on the server, so SSH into the server and start the console:
ssh [email protected]
cd pre-assembly/current
bundle exec rails c -e production
Total number of unique projects (i.e. batch_contexts, includes all pre-assembly and discovery report derived jobs):
BatchContext.count
Total number of Job Runs (both pre-assembly and discovery report):
JobRun.count
Total number of pre-assembly jobs:
JobRun.where(job_type: 'preassembly').count
Total number of discovery report jobs:
JobRun.where(job_type: 'discovery_report').count
Total number of projects that used a file manifest:
BatchContext.where(using_file_manifest: true).count
Total number of projects that did not use a file manifest:
BatchContext.where(using_file_manifest: false).count
Total number of distinct users who have created projects:
User.count
Number of projects by user, descending:
BatchContext.joins(:user).group(:sunet_id).count.sort_by {|_key, value| -value}.to_h
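The grouped count is a hash of user to project count, so the ordering step can be tried in plain Ruby without a database. The following is a minimal sketch; the helper name `sort_counts_descending` and the sample users are hypothetical and not part of the application.

```ruby
# Hypothetical helper: order a {user => count} hash from most to fewest projects.
def sort_counts_descending(counts)
  # Negating the value makes sort_by (which sorts ascending) put the largest count first.
  counts.sort_by { |_key, value| -value }.to_h
end

# Made-up counts, shaped like the hash returned by group(:sunet_id).count.
counts = { 'user_a' => 3, 'user_b' => 12, 'user_c' => 7 }
sort_counts_descending(counts).each { |user, total| puts "#{user}: #{total}" }
```

Sorting on the value alone, without the negation, lists the least active users first.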
Number of projects by user in a time frame, descending:
BatchContext.where("batch_contexts.updated_at > ?", DateTime.now.utc - 1.years).joins(:user).group(:sunet_id).count.sort_by {|_key, value| value}.reverse!
List users who use file_manifest.csv, and how many projects each has created with one:
BatchContext.where(using_file_manifest: true).joins(:user).group(:sunet_id).count.sort_by {|_key, value| value}.to_h
Number of projects by user, in a year increment:
year = 2022
BatchContext.joins(:user).group(:sunet_id).where('batch_contexts.created_at >= ? and batch_contexts.created_at < ?', Time.zone.parse("#{year}/01/01"), Time.zone.parse("#{year + 1}/01/01")).count.sort_by {|_key, value| value}.to_h
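One way to make a year filter cover the whole calendar year is a half-open interval: from the start of the year up to, but not including, the start of the next year. The sketch below shows the idea in plain Ruby with UTC times. The helper names `year_window` and `in_year?` are hypothetical, and on the Rails console you would likely build the bounds with `Time.zone.parse` instead.

```ruby
# Hypothetical helper: the start of the given year and the start of the following year (UTC).
def year_window(year)
  [Time.utc(year, 1, 1), Time.utc(year + 1, 1, 1)]
end

# True when the timestamp falls inside the half-open interval [start_of_year, start_of_next).
def in_year?(timestamp, year)
  start_of_year, start_of_next = year_window(year)
  timestamp >= start_of_year && timestamp < start_of_next
end

puts in_year?(Time.utc(2022, 12, 31, 15, 0, 0), 2022) # an afternoon on December 31
puts in_year?(Time.utc(2023, 1, 1), 2022)             # midnight starting the next year
```

An upper bound of midnight on December 31 would leave out anything created during that final day, which the half-open form avoids.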
The following is run as a rake task, so NOT on the Rails console, but on the server itself in the directory where the app is installed.
The task iterates over all discovery report jobs in the database and outputs statistics on all jobs that still have JSON reports available on disk. The output is a CSV file with the following headers: num_objects, num_files, num_errors, runtime_minutes, user, report_date
ssh [email protected]
cd pre-assembly/current
RAILS_ENV=production bundle exec rake reports:discovery
less tmp/discovery_report_stats.csv
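Once the CSV exists, it can be summarized with a short Ruby script instead of being read by eye. The following is a sketch only: it assumes the header names listed above and that the count and runtime columns hold plain numbers. The `summarize` helper and the sample rows are made up for illustration.

```ruby
require 'csv'

# Hypothetical helper: summarize discovery report statistics from CSV text.
# Assumes the headers num_objects, num_errors, runtime_minutes, and user are present.
def summarize(csv_text)
  rows = CSV.parse(csv_text, headers: true).map(&:to_h)
  runtimes = rows.map { |row| row['runtime_minutes'].to_f }
  {
    reports: rows.size,
    total_objects: rows.sum { |row| row['num_objects'].to_i },
    total_errors: rows.sum { |row| row['num_errors'].to_i },
    # Guard against an empty file to avoid dividing by zero.
    mean_runtime_minutes: runtimes.empty? ? 0.0 : runtimes.sum / runtimes.size,
    reports_by_user: rows.map { |row| row['user'] }.tally
  }
end

# Made-up sample data in the assumed layout.
sample = <<~SAMPLE
  num_objects,num_files,num_errors,runtime_minutes,user,report_date
  10,40,0,2.5,user_a,2022-03-01
  5,12,1,1.5,user_b,2022-03-02
  20,90,2,5.0,user_a,2022-04-10
SAMPLE

puts summarize(sample)
```

To run it against the real output, pass `File.read('tmp/discovery_report_stats.csv')` to `summarize` from the app directory.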