github IQSS/dataverse v4.9

5 years ago

Note: We recommend upgrading to 4.9.1, which includes a patch to address a high impact bug. Learn more in the 4.9.1 Release Notes.

This release introduces two new features, File PIDs and Provenance, and adds a new Metrics API. We have also updated the Solr version used for search, improved error handling for file upload, fixed a memory leak in Export, and added several more useful APIs: move dataverse, link dataset and dataverse, and uningest a tabular data file. Numerous bug fixes and documentation improvements have been made.

  • File PIDs
  • Provenance
  • Metrics API
  • Update Solr to v7.3
  • Move Dataverse API
  • Link Dataset and Dataverse APIs
  • Uningest tabular file
  • Make file upload more robust by improving error handling
  • Fix memory leak in Export
  • Fix issues with the Contact Us email: the From address is now the Dataverse server and the Reply-To address is the requestor
  • Change the way DOIs and Handles are stored in the database to be more flexible with respect to format.
  • Add Mixtepec Mixtec language to metadata list of languages.
  • Make metadata URLs clickable, e.g., Alternative URL

For the complete list of issues, see the 4.9 milestone in GitHub.

For help with upgrading, installing, or general questions, please email support@dataverse.org.

Installation:

If this is a new installation, please see our Installation Guide.

Upgrade:

If you are upgrading from v4.x, you must upgrade to each intermediate version before installing this version.
This release has a number of extra steps, most notably upgrading Solr, migrating DOIs to a new storage format, and reindexing. This will require a brief downtime and a period of incomplete search records as the index rebuilds, post Solr upgrade. It is strongly recommended you test this upgrade in a test environment and back up your database before deploying to production.
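
As a minimal sketch of the recommended backup, the following wraps the pg_dump --clean command mentioned in the steps below and checks that the dump file is not empty. The function names, the file-name pattern, and the backup directory are hypothetical, not part of Dataverse; adjust them to your site.

```shell
#!/bin/sh
# Sketch: dump the Dataverse database to a timestamped file before upgrading.
# backup_name, backup_db and the file-name pattern are hypothetical helpers.

# Build a file name such as <dir>/<db>-pre-4.9-<stamp>.sql
backup_name() {
    dir=$1; db=$2; stamp=$3
    printf '%s/%s-pre-4.9-%s.sql\n' "$dir" "$db" "$stamp"
}

# Dump database $1 as user $2 into directory $3.
# Fails if pg_dump fails or if the resulting file is missing or empty.
backup_db() {
    db=$1; user=$2; dir=$3
    out=$(backup_name "$dir" "$db" "$(date +%Y%m%d-%H%M%S)")
    pg_dump --clean -U "$user" -f "$out" "$db" || return 1
    [ -s "$out" ] || return 1
    # Print the path of the dump so the caller can record it.
    printf '%s\n' "$out"
}
```

After sourcing the file, a call would look like backup_db <db name> <db user> /path/to/backups. Keep the printed path so the dump can be restored if the update script goes wrong.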

When upgrading from the previous version, you will need to do the following:

  1. Shut down access to the production service; you do not want users interacting with the site during the upgrade.
  2. Undeploy the current version of Dataverse from each web server.
    • <glassfish install path>/glassfish4/bin/asadmin list-applications
    • <glassfish install path>/glassfish4/bin/asadmin undeploy dataverse
  3. Stop glassfish, remove the generated directory, and restart glassfish.
    • service glassfish stop
    • remove the generated directory: rm -rf <glassfish install path>/glassfish4/glassfish/domains/domain1/generated
    • service glassfish start
  4. Back up the production database.
  5. Install and configure Solr v7.3.
    See http://guides.dataverse.org/en/4.9/installation/prerequisites.html#installing-solr
  6. Deploy v4.9 to the web servers.
    • <glassfish install path>/glassfish4/bin/asadmin deploy <path>dataverse-4.9.war
  7. Upgrade the database by running the update script.
    Once again, we STRONGLY RECOMMEND taking a full backup of the database before proceeding with the upgrade. Among other changes in this release, we are rearranging the way DOI identifiers are stored in the database. While your existing persistent identifiers stay the same (as the name suggests!), the update script will modify the database entries (it has to do with how the "authority" and "shoulder" suffix are stored). And since we are modifying something as important as the identifiers of your datasets, it's a great idea to have a handy way to restore your database as it was, in the unlikely event anything goes wrong.
    pg_dump --clean <db name> is a good way to save the entire database as an importable .sql file.
    Run the upgrade script:
    • psql -U <db user> -d <db name> -f upgrade_v4.8.6_to_v4.9.0.sql
  8. (Optionally) Enable Provenance
curl -X PUT -d 'true' http://localhost:8080/api/admin/settings/:ProvCollectionEnabled

  9. Update metadata languages list
curl http://localhost:8080/api/admin/datasetfield/load -X POST --data-binary @citation.tsv -H "Content-type: text/tab-separated-values"
  10. Restart glassfish
  11. Clear index, then index all metadata
curl http://localhost:8080/api/admin/index/clear
curl http://localhost:8080/api/admin/index

Please note: Do not run the registerDataFileAll command below if you do not plan to give your files persistent identifiers, which are no longer required in 4.9.3 or later (#4929).

  12. Run the retroactive file PID registration script, or call the endpoint that registers all file PIDs
    Note: if you have a large number of files being registered, you may want to contact your DOI provider in advance to determine whether this level of traffic will cause a problem for their service.
curl http://localhost:8080/api/admin/registerDataFileAll?key=<super user api token>

This utility logs its progress to server.log, followed by a completion message reporting a total and any failures.
  13. When file registration completes, perform an in-place reindex.

curl -X DELETE http://localhost:8080/api/admin/index/timestamps
curl http://localhost:8080/api/admin/index/continue
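
The two reindex passes in the steps above can be wrapped so that a failed request stops the sequence instead of being overlooked. This is a sketch only: DV_URL and the function names are hypothetical, while the endpoints are the ones listed in the steps. It relies on curl's --fail option, which makes curl exit non-zero when the server returns an HTTP error status.

```shell
#!/bin/sh
# Sketch: run the admin index calls from the upgrade steps, stopping at the first failure.
# DV_URL, dv_admin, full_reindex and inplace_reindex are hypothetical names.

DV_URL=${DV_URL:-http://localhost:8080}

# Call an admin endpoint with the given HTTP method.
# --fail: exit non-zero on an HTTP error; --silent/--show-error: quiet except for errors.
dv_admin() {
    method=$1; path=$2
    curl --fail --silent --show-error -X "$method" "$DV_URL/api/admin/$path"
}

# Clear the index, then index all metadata (run after deploying and restarting).
full_reindex() {
    dv_admin GET index/clear && dv_admin GET index
}

# In-place reindex (run after file PID registration completes).
inplace_reindex() {
    dv_admin DELETE index/timestamps && dv_admin GET index/continue
}
```

Because each function chains its two calls with &&, the second request is sent only if the first one succeeded.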

If you are upgrading from v3.x, you will need to perform a migration to v4.x, since our application was redesigned and the database schemas are completely different. This is a significant undertaking. Please contact us (support at dataverse.org) before beginning. Also refer to our migration Google Group for additional support and information: https://groups.google.com/forum/#!forum/dataverse-migration-wg

IMPORTANT: If you are running TwoRavens as part of your Dataverse installation:
Make sure the two applications are using the same version of the "pre-processed statistics" R code. Compare the two files:
On the TwoRavens side:
.../dataexplore/rook/preprocess/preprocess.R
On the Dataverse side:
.../applications/dataverse-4.9/WEB-INF/classes/edu/harvard/iq/dataverse/rserve/scripts/preprocess.R

If they are different, replace the Dataverse copy with the TwoRavens copy (i.e., the TwoRavens version wins!).
Also remove all the already-generated pre-processed fragments in your Dataverse file directory, for example:

cd [files directory]
rm -f `find . -name '*.prep'`

If the two copies are the same, you don't need to do any of this.
Please note that this is a temporary measure; we are working on a fix that will make the two applications resolve code version conflicts like this automatically.
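
The comparison and cleanup described above could be scripted along these lines. This is a sketch under stated assumptions: sync_preprocess and its three arguments are hypothetical, and the paths to the two preprocess.R copies and to the files directory must be supplied by you, since they depend on your installation.

```shell
#!/bin/sh
# Sketch: keep the Dataverse copy of preprocess.R in sync with the TwoRavens copy.
# sync_preprocess is a hypothetical helper; pass your own paths as arguments.

sync_preprocess() {
    tworavens_copy=$1   # preprocess.R on the TwoRavens side
    dataverse_copy=$2   # preprocess.R on the Dataverse side
    files_dir=$3        # the Dataverse files directory

    # cmp -s prints nothing and returns 0 only when the files are byte-identical.
    if cmp -s "$tworavens_copy" "$dataverse_copy"; then
        echo "identical, nothing to do"
        return 0
    fi

    # The copies differ: the TwoRavens version wins.
    cp "$tworavens_copy" "$dataverse_copy" || return 1

    # Remove the already-generated pre-processed fragments.
    find "$files_dir" -name '*.prep' -type f -exec rm -f {} +
    echo "replaced Dataverse copy and removed .prep fragments"
}
```

Using find with -exec avoids problems with file names containing spaces, which the backtick form of the command does not handle.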
