VIVO Release 1 v1.3 Upgrade Guide

July 22, 2011 - Upgrading from Release 1 v1.2 to Release 1 v1.3

Release announcement for V1.3
Upgrade process for V1.3

This document provides a short description of the steps involved in upgrading your installation of VIVO from Version 1.2+ to Version 1.3. This and other documentation can be found on the support page at VIVOweb.org.

If you need to do a fresh install, please consult the VIVO Release 1 v1.3 Installation Guide found on vivoweb.org or the install.html file located in the doc directory of the VIVO source code distribution. The installation document also has a list of the required software and versions (there are no new hardware or software requirements for V1.3).

Release Announcement for V1.3

VIVO Release 1.3 incorporates changes to the search indexing, user accounts, menu management, ontology, and visualizations.

Search

VIVO 1.3 will feature notable improvements to the local search, primarily to improve relevance ranking but also to boost the influence of semantic relationships in the search. This will improve recall by including text from related resources (e.g., adding a person's grant and publication titles to his or her search entry) and by boosting overall relevance ranking based on the number and nature of connections from one individual to others.
VIVO is now using Apache Solr (http://lucene.apache.org/solr/) in place of Apache Lucene to improve indexing and faceting of search results. The migration to Solr also aligns the local search with the VIVO multi-site search site under development for release prior to the 2011 VIVO Conference.

Authorization

Release 1.3 provides an entirely new model of authorization within the VIVO application to allow more granular control over system configuration and editing. The first phase of the new user account interface is included in V1.3. This interface provides a user search, a root account, and password reset functionality where the password gets emailed to the user. The next phase will provide the ability to create new roles.

Menu management

The menus across the top of the site (Home, People, Organizations, Research, Events) can now be managed in a web form instead of editing an RDF file. In addition to making site management much easier, form-based editing also allows more control over what classes of data are displayed and provides a mechanism to limit certain menu pages to content identified as internal to the institution.

FreeMarker template improvements

While less directly visible to the public, version 1.3 also includes additional changes focused directly on supporting open source community involvement in extending and customizing VIVO. The development team began a year ago to transition VIVO's code base away from Java Server Pages to the FreeMarker page templating system that much more cleanly separates internal application programming logic from page display.

Visualization

The visualization team has implemented a Map of Science visualization, which allows users to visually explore the scientific strengths of a university, school, department, or person in the VIVO instance. Users will be able to see where an organization's or person's interests lie across 13 major scientific disciplines or 554 sub-disciplines, and will be able to see how these disciplines and sub-disciplines interrelate with one another on the map of science. Wireframes and design documentation for upcoming enhanced versions of the Map of Science visualization have already been developed; the Map of Science visualization will most likely be in the form of a PDF that a user can download.
Several visualizations now also provide a caching feature that improves performance after the initial processing.

QR Codes

Pages for people in VIVO have the option of displaying QR codes.

Ontology changes

support for certifications and licenses
expanded support for intellectual property (patents); it was present as a stub before but did not allow common things such as assignee and issuer
support for editorial, reviewing and organizing activities
expanded shared geographical instance data vocabulary to include the 50 U.S. states
representing specific types of EducationalTraining: PostdoctoralTraining, Internship, MedicalResidency

Linked open data

Responses to linked data requests have been enhanced to include additional context about any individual. This works toward the goal of making all the data in a person's profile available as RDF via a single web request.

I. Before Performing the Upgrade

Please ensure that backups are created of the:

Tomcat webapps directory
Original source directory
MySQL database (mysqldump)
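
The three backups listed above can be scripted. The following is a minimal sketch in Python that only assembles the commands and does not run them. The function name, the archive file names, and every path, database name, and user name passed to it are hypothetical; adjust them to your installation before use.

```python
from datetime import date
from pathlib import Path


def backup_commands(webapps_dir, source_dir, db_name, db_user, backup_dir, stamp=None):
    """Return the three backup commands as argument lists.

    Nothing is executed here. After reviewing the paths, each list can be
    passed to subprocess.run(cmd, check=True).
    """
    stamp = stamp or date.today().isoformat()
    out = Path(backup_dir)
    return [
        # Archive the Tomcat webapps directory.
        ["tar", "-czf", str(out / f"webapps-{stamp}.tar.gz"), str(webapps_dir)],
        # Archive the original VIVO source directory.
        ["tar", "-czf", str(out / f"vivo-source-{stamp}.tar.gz"), str(source_dir)],
        # Dump the MySQL database; --password without a value makes mysqldump prompt.
        ["mysqldump", f"--user={db_user}", "--password",
         f"--result-file={out / f'{db_name}-{stamp}.sql'}", db_name],
    ]
```

Returning the commands instead of running them makes it easy to inspect exactly what will be archived before anything touches the production server.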

The upgrade process is similar to the original install process with the following EXCEPTIONS:

If you are still in RDB mode, it is required that you move your triple store to SDB while still at V1.2 (see Triple Store info below).
DO NOT reinstall MySQL or recreate the MySQL database. Please ensure that you back up the MySQL database. Also note that VIVO 1.2 will not run on older versions of MySQL that may have worked with 1.1.1. Be sure to run VIVO 1.2 with MySQL 5.1 or higher. Using unsupported versions may result in strange error messages related to table formatting or other unexpected problems.
It is not necessary to add RDF data.
First-time login of the administrator account after the upgrade process is complete will use the password previously set, NOT the default password used on the first login after the initial installation. With V1.3 there is also a new root user. Please see the section on Authorization changes for more information.
The first time Apache Tomcat starts up after the upgrade, it will initiate a process that modifies the knowledge base to align the data with the revised ontology. See the section on the Ontology Upgrade below for more information.

II. Noteworthy Changes

i. Triple store changes

VIVO 1.3 now requires you to use Jena's SPARQL database (SDB) for the triple store technology. Jena's legacy relational database store (RDB) was used by VIVO 1.1.1 and earlier. Both SDB and RDB were available in VIVO 1.2 and 1.2.1. It is required that you move your triple store to SDB while still at V1.2.

SDB mode caches only a fraction of the RDF data in memory. Most queries are issued directly against the underlying database. This allows VIVO installations to display data from large RDF models while requiring only a small amount of server memory to run the application. There is a tradeoff in response time: pages may take slightly longer to load in SDB mode, and performance will depend on the configuration parameters of the database server. Additionally, advanced OWL reasoning (not enabled by default in either mode) is not possible in SDB mode. With SDB, only the default set of inferences (inferred rdf:type statements) are generated, and they are generated as soon as data is edited rather than in a background process.

A conversion from RDB to SDB mode can take a number of hours to complete if the installation contains a large amount of RDF data (roughly a million triples or more). You can start the conversion process in the background while the RDB system is running. This will reduce the delay in initial startup after the application is redeployed with deploy.properties set for SDB. Note that it is important not to edit any data anywhere in the application while this background conversion is running.

To start the SDB conversion, log in as a system administrator and request /sdbsetup (For example, if your VIVO is installed at http://vivo.myuniversity.edu/ you would type http://vivo.myuniversity.edu/sdbsetup into your browser).

Click the button that appears on this page.

During the course of the SDB setup, which may take several hours on a large database, subsequent requests to /sdbsetup will display a message that the operation is still in progress. When a request for this page shows a message that the SDB setup has completed successfully, shut down Tomcat, set deploy.properties to SDB mode, redeploy, and restart Tomcat. VIVO will now be running from the SDB store.

ii. Theme changes

The vivo-basic theme was deprecated with VIVO V1.2 and is no longer present in the V1.3 release as it does not support V1.2 or V1.3 features. It is highly recommended that you use the wilma theme or modify the wilma theme for branding or to create a custom look and feel. Please see the Site Administration Guide for more information about customizing your theme.

iii. Template changes

The ${stylesheets}, ${scripts}, and ${headScripts} add() methods now take the full tag as an argument. This will require a change to all calls to these methods in the templates. This change allows for specification of attributes such as media directly in the tag. For example:

1.2: ${stylesheets.add("/css/individual/individual.css")}
1.3: ${stylesheets.add('<link rel="stylesheet" href="${urls.base}/css/individual/individual.css" />')}

Note the inclusion of ${urls.base} in the 1.3 example. The add() method no longer prefixes the context path to the url, so the full url must be specified in the tag.
The addFromTheme() methods of the ${stylesheets}, ${scripts}, and ${headScripts} objects have been deleted. Substitute as shown in the preceding example.
propertyGroups.getPropertyAndRemoveFromList() in the individual templates has been deprecated. The replacement method is propertyGroups.pullProperty(). There is no change in functionality.
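
When a theme has many templates, the changes above can be partly mechanized. The sketch below is an illustration in Python, not part of the VIVO distribution: it rewrites only the single-argument, double-quoted stylesheets.add() form shown in the 1.2 example, and merely reports lines that use addFromTheme() or getPropertyAndRemoveFromList() so they can be fixed by hand. The function names are hypothetical, and calls written in any other form are left untouched.

```python
import re

# 1.2-style call: add() with one bare path in double quotes.
OLD_STYLESHEET_ADD = re.compile(r'\$\{stylesheets\.add\("(/[^"<>]*)"\)\}')
# Methods that were deleted or deprecated in 1.3.
NEEDS_MANUAL_FIX = re.compile(r'\.addFromTheme\(|\.getPropertyAndRemoveFromList\(')


def upgrade_stylesheet_calls(template_text):
    """Rewrite 1.2-style stylesheets.add() calls to the 1.3 full-tag form."""
    def to_tag(match):
        path = match.group(1)
        return ("${stylesheets.add('<link rel=\"stylesheet\" "
                "href=\"${urls.base}" + path + "\" />')}")
    return OLD_STYLESHEET_ADD.sub(to_tag, template_text)


def lines_needing_review(template_text):
    """Return (line_number, line) pairs that use removed or deprecated methods."""
    return [(n, line)
            for n, line in enumerate(template_text.splitlines(), start=1)
            if NEEDS_MANUAL_FIX.search(line)]
```

Because 1.3-style calls wrap the tag in single quotes, running upgrade_stylesheet_calls() twice on the same template leaves already converted calls unchanged.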

iv. List view changes

<query-base> and <query-collated> have been replaced with a single query <query-select> that contains tags for fragments to be used only in the collated version of the query. This and other changes are documented in /vitro/doc/list_view_configuration_guidelines.txt.

v. Authorization changes

In release 1.3, the VIVO authorization system has changed extensively. In summary:

Each user will have a user account, even if the user logs in by Shibboleth or some other external authentication system.
E-mail is used to notify users when an account is created for them, or when an administrator edits their account.
A "root" user account exists which has access to all pages and all data fields. This is a powerful tool that can hold some surprises.

a. User Accounts are created for externally authenticated users

With release 1.3, each authenticated user will have a user account. If someone logs in using an external authentication system, and no user account matches their external login credentials, an account will be created.

The user will be prompted to enter information for the account being created: first name, last name, and e-mail address.

b. E-mail address becomes an important part of User Accounts

Prior to release 1.3, each user account was identified by a Username field. This field was labeled as "E-mail address" on some pages in VIVO, but no mail was ever sent. In release 1.3, this has changed, so the e-mail address is fully used, both for identification and for communication with the user.

1. User Account data is restructured

Prior to release 1.3, the Username field (also referred to as 'e-mail address') was used for several purposes:

Identifying the user account.
Part of the user's credentials when logging in (along with a password).
Connecting the user account to an external authentication system, like Shibboleth or CUWebAuth.
Connecting the user account to a personal Profile page.

With release 1.3, these functions are handled by two separate fields, EmailAddress and ExternalAuthId.

EmailAddress is used when logging in (along with a password).
EmailAddress is used to send notifications to the user about changes to his/her account (see below).
The ExternalAuthId is used when logging in using an external authentication system.
The ExternalAuthId is used to connect the user account to a personal Profile page.
Note: With release 1.3, the ExternalAuthId can now be matched against either an untyped literal or a string literal in the Profile page.

There are other changes to the internal structure of the user accounts data, but they are important mostly to the VIVO software developers, and you are not likely to notice them.

2. Existing User Accounts are migrated

If you are upgrading to VIVO release 1.3 from an existing VIVO installation, the user accounts in your system will be migrated into the new data structures. When migrating an account, both the EmailAddress field and the ExternalAuthId field will be set to the value of the Username field in the old account. The new account should behave as the old account did.

When creating a new user account, or editing an existing one, the system requires that your e-mail address be in a valid form, like somebody@somewhere.edu. You should plan for this as part of your migration to release 1.3.
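
To plan for this, it can help to list the existing usernames that will not pass as e-mail addresses once migrated. The sketch below is a minimal illustration in Python under stated assumptions: the class and function names are hypothetical, the regular expression is a deliberately loose stand-in for whatever validation VIVO actually applies, and the list of old usernames is assumed to have been exported by some other means.

```python
import re
from dataclasses import dataclass

# Loose check for the somebody@somewhere.edu shape; VIVO's own rule is not
# reproduced here.
EMAIL_SHAPE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


@dataclass
class MigratedAccount:
    email_address: str     # stands in for the EmailAddress field
    external_auth_id: str  # stands in for the ExternalAuthId field


def migrate(old_username):
    """Copy the old Username into both new fields, as described above."""
    return MigratedAccount(email_address=old_username,
                           external_auth_id=old_username)


def accounts_to_fix(old_usernames):
    """Return usernames whose migrated e-mail address is not in a valid form."""
    return [u for u in old_usernames
            if not EMAIL_SHAPE.match(migrate(u).email_address)]
```

For example, accounts_to_fix(["jdoe", "somebody@somewhere.edu"]) returns ["jdoe"], which identifies an account whose e-mail address will need to be corrected the first time it is edited.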

3. E-mail is incorporated into the workflow for User Accounts

With release 1.3, VIVO users receive e-mail notifications when an account is created or modified for them or by them.

When an administrator creates a user account, the user will receive an e-mail notification, telling them that the account has been created, and providing a link to VIVO that will allow them to set a password on the account.

Note: when creating the account, the administrator may indicate that it will only be used with an external authentication system like Shibboleth or CUWebAuth. In this case, the account will not require a password, and the e-mail notification message to the user will not provide a password link.

When an administrator edits a user account, he may choose to reset the password. As with a new account, the user will receive notification with a link to VIVO that will allow them to set a new password.

If a user changes the e-mail address on his account, he will receive a notification message to that effect.

If a user account is auto-created for a user with external authentication credentials, the user will receive a notification message.

4. Disabling e-mail notification

The e-mail notification relies on two configuration properties: email.smtpHost and email.replyTo. If either of these properties is missing or empty, VIVO will not attempt to send e-mail notifications to users.

This can be useful for small or experimental installations of VIVO, or where e-mail notification is not desired.

If e-mail notifications are disabled, an administrator must set a password on each new account, since the user will have no way of setting it. When the user logs in for the first time, VIVO will require them to change their password to one of their own choosing.
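
The rule for whether notifications are sent can be expressed compactly. The following Python sketch restates it for a dictionary of property names and values; the function name is hypothetical, and the dictionary is assumed to have been loaded from deploy.properties elsewhere.

```python
def email_notifications_enabled(props):
    """Return True only when both e-mail properties are present and non-empty."""
    return all(props.get(name, "").strip()
               for name in ("email.smtpHost", "email.replyTo"))
```

A result of False indicates that an administrator will have to set the initial password on each new account.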

c. Each VIVO installation will have a 'root' account.

Prior to release 1.3, each VIVO installation was created with a default administrator's account. In release 1.3, there is no such account. Instead, each VIVO installation will have a "root" account.

The email address for the root account is specified in deploy.properties, like this:

rootUser.emailAddress = vivo_root@mydomain.edu

The password for this account is automatically set to rootPassword, but you will be required to change the password the first time you log in.

Note: the initialAdminUser is no longer used.

The root account is not a site administrator's account; it is more powerful than a site administrator's account. The root account is permitted to visit any page in a VIVO application. It is permitted to see any data property, and to enter data into any field. As such, the root account can be very useful and rather dangerous. It can also give you a distorted view of what your VIVO site looks like, since data is shown which other accounts cannot see.

The root account is not intended for routine, everyday use. The best way to use the root account is to create a site administrator's account. After that, use the root account only when necessary.

III. The Upgrade Process

1. Download the new distribution file and unpack it into a new source directory.

2. Create a new deploy.properties using the same values as in your previous installation and set values for the new variables as described below (vitro.local.solr.url, vitro.local.solr.ipaddress.mask, vitro.home.directory, email.smtpHost, email.replyTo, rootUser.emailAddress).

Property Name	Example Value
Default namespace: VIVO installations make their RDF resources available for harvest using linked data. Requests for RDF resource URIs redirect to HTML or RDF representations as specified by the client. To make this possible, VIVO's default namespace must have a certain structure and begin with the public web address of the VIVO installation. For example, if the web address of a VIVO installation is "http://vivo.example.edu/" the default namespace must be set to "http://vivo.example.edu/individual/" in order to support linked data. Similarly, if VIVO is installed at "http://www.example.edu/vivo" the default namespace must be set to "http://www.example.edu/vivo/individual/". The namespace must end with "individual/" (including the trailing slash).
Vitro.defaultNamespace	http://vivo.mydomain.edu/individual/
Directory where Vitro code is located. In most deployments, this is set to ./vitro-core (It is not uncommon for this setting to point elsewhere in development environments).
vitro.core.dir	./vitro-core
Directory where tomcat is installed.
tomcat.home	/usr/local/tomcat
Name of your VIVO application.
webapp.name	vivo
URL of Solr context used in local VIVO search. Should consist of: scheme + servername + port + vivo_webapp_name + "solr" In the standard installation, the Solr context will be on the same server as VIVO, and in the same Tomcat instance. The path will be the VIVO webapp.name (specified above) + "solr"
vitro.local.solr.url	http://localhost:8080/vivosolr
Restricts access to the Solr search platform. One or more regular expressions, separated by commas. When a request is made to Solr, the IP address of the requestor must match one of the patterns, or the request will be rejected. Examples:
vitro.local.solr.ipaddress.mask = 127\.0\.0\.1
vitro.local.solr.ipaddress.mask = 127\.0\.0\.1,0:0:0:0:0:0:0:1
vitro.local.solr.ipaddress.mask = 169.254.*
vitro.local.solr.ipaddress.mask	127\.0\.0\.1,0:0:0:0:0:0:0:1
Directory where the VIVO application will store the data that it creates. This includes uploaded files (usually images) and the Solr search index. Be sure this directory exists and is writable by the user who the Tomcat service is running as.
vitro.home.directory	/usr/local/vivo/data
Specify an SMTP host that the application will use for sending e-mail (Optional). If this is left blank, the contact form will be hidden and disabled, and users will not be notified of changes to their accounts.
email.smtpHost	smtp.servername.edu
Specify an email address which will appear as the sender in e-mail notifications to users (Optional). If a user replies to the notification, this address will receive the reply. If a user's e-mail address is invalid, this address will receive the error notice. If this is left blank, users will not be notified of changes to their accounts.
email.replyTo	vivoAdmin@my.domain.edu
Specify the JDBC URL of your database. Change the end of the URL to reflect your database name (if it is not "vivo").
VitroConnection.DataSource.url	jdbc:mysql://localhost/vivo
Change the username to match the authorized user you created in MySQL.
VitroConnection.DataSource.username	username
Change the password to match the password you created in MySQL.
VitroConnection.DataSource.password	password
Specify the Jena triple store technology to use. SDB is Jena's SPARQL database; this setting allows RDF data to scale beyond the limits of the JVM heap. Set to RDB to use the older Jena RDB store with in-memory caching.
VitroConnection.DataSource.tripleStoreType	SDB
Specify the maximum number of active connections in the database connection pool to support the anticipated number of concurrent page requests. It is not necessary to adjust this value when using the RDB configuration.
VitroConnection.DataSource.pool.maxActive	40
Specify the maximum number of database connections that will be allowed to remain idle in the connection pool. Default is 25% of the maximum number of active connections.
VitroConnection.DataSource.pool.maxIdle	10
Change the dbtype setting to use a database other than MySQL. Otherwise, leave this value unchanged. Possible values are DB2, derby, HSQLDB, H2, MySQL, Oracle, PostgreSQL, and SQLServer. Refer to http://openjena.org/wiki/SDB/Databases_Supported for additional information.
VitroConnection.DataSource.dbtype	MySQL
Specify a driver class name to use a database other than MySQL. Otherwise, leave this value unchanged. The JAR file for this driver must be added to the webapp/lib directory within the vitro.core.dir specified above.
VitroConnection.DataSource.driver	com.mysql.jdbc.Driver
Change the validation query used to test database connections only if necessary to use a database other than MySQL. Otherwise, leave this value unchanged.
VitroConnection.DataSource.validationQuery	SELECT 1
Specify the email address of the root user account for the VIVO application. This user will have an initial temporary password of 'rootPassword'. You will be prompted to create a new password on first login.
rootUser.emailAddress	vivoAdmin@my.domain.edu
The URI of a property that can be used to associate an Individual with a user account. When a user logs in with a name that matches the value of this property, the user will be authorized to edit that Individual.
selfEditing.idMatchingProperty	http://vivo.mydomain.edu/ns#networkId
The temporal graph visualization can require extensive machine resources. This can have a particularly noticeable impact on memory usage under these conditions: VIVO is configured to use Jena SDB; the organization tree is deep; the number of grants and publications is large. The VIVO 1.3 release mitigates this problem with a caching mechanism, so the visualization can safely be enabled by default.
visualization.temporal	enabled
The temporal graph visualization is used to compare different organizations/people within an organization on parameters like number of publications or grants. By default, the app will attempt to make its best guess at the top level organization in your instance. If you're unhappy with this selection, uncomment the property below and set it to the URI of the organization individual you want to identify as the top level organization. It will be used as the default whenever the temporal graph visualization is rendered without being passed an explicit org. For example, to use "Ponce School of Medicine" as the top organization: `visualization.topLevelOrg = http://vivo.psm.edu/individual/n2862`
visualization.topLevelOrg	http://vivo.mydomain.edu/individual/topLevelOrgURI
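
Several of the constraints in the table above can be checked before deploying. The sketch below, in Python, is an illustration under stated assumptions and not a tool shipped with VIVO: the parser handles only simple "name = value" lines, the REQUIRED list is this sketch's own guess at which properties must be present, and the Solr check applies only to the standard installation described above.

```python
def parse_properties(text):
    """Parse simple 'name = value' lines, skipping blanks and # or ! comments.

    Java's properties format has more features (':' separators, line
    continuations); this simplified parser ignores them.
    """
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#!":
            continue
        name, _, value = line.partition("=")
        props[name.strip()] = value.strip()
    return props


# Assumed to be mandatory for this sketch; the e-mail settings are optional.
REQUIRED = (
    "Vitro.defaultNamespace",
    "webapp.name",
    "vitro.local.solr.url",
    "vitro.local.solr.ipaddress.mask",
    "vitro.home.directory",
    "rootUser.emailAddress",
)


def check_deploy_properties(props):
    """Return a list of problems found; an empty list means none were detected."""
    problems = [f"{name} is missing or empty"
                for name in REQUIRED if not props.get(name)]

    namespace = props.get("Vitro.defaultNamespace", "")
    if namespace and not namespace.endswith("individual/"):
        problems.append('Vitro.defaultNamespace must end with "individual/"')

    # Standard installation: the Solr context is the webapp name plus "solr".
    webapp = props.get("webapp.name", "")
    solr_url = props.get("vitro.local.solr.url", "")
    if webapp and solr_url and not solr_url.rstrip("/").endswith("/" + webapp + "solr"):
        problems.append("vitro.local.solr.url does not end with webapp.name + 'solr'")

    # The "Triple store changes" section states that 1.3 requires SDB.
    store = props.get("VitroConnection.DataSource.tripleStoreType", "")
    if store and store != "SDB":
        problems.append("VitroConnection.DataSource.tripleStoreType is not SDB")
    return problems
```

A typical use is check_deploy_properties(parse_properties(open("deploy.properties").read())), run from the new source directory before step 5.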

3. Apply any previous changes you have made to the new source directory.

Special notes regarding source files
This process assumes any changes made to the application were made in the source directory and deployed, and were not made directly within the Tomcat webapps directory.

In many cases, simply copying the modified files from your original source directory will not work since the files on which they are based have changed. It will be necessary to inspect the new source files and add any changes to them at that time.
NIH-funded VIVO implementations will need to apply the Google Analytics Tracking Code (GATC) to googleAnalytics.ftl in the theme:
[new_source_directory]/themes/[theme_dir]/templates/googleAnalytics.ftl
A sample googleAnalytics.ftl is included in the built-in theme. This file serves only as an example, and you must replace the tracking code shown with your institution's own tracking code. For additional information about the GATC for the NIH-funded VIVO implementation sites and a copy of your institution's tracking code, see the VIVO Google Analytics wiki page.
If you had used the vivo/contrib/FLShibboleth code in your previous release, you should stop using it. Consult install.html or VIVO Release 1 v1.2 Installation Guide on "Using an External Authentication System with VIVO".

4. If you had modified web.xml to configure the Pellet Reasoner (as described in the installation instructions), repeat that modification.

5. Stop Apache Tomcat and, from your VIVO source directory, run ant by typing: ant all

6. Start Apache Tomcat and log in to VIVO.

IV. Ontology Changes

i. Verify Ontology upgrade process

After Apache Tomcat is started, the ontology alignment process will create the following files in the Tomcat webapps/vivo/WEB-INF directory. Review them to verify that the automated upgrade process was executed successfully:

ontologies/update/logs/knowledgeBaseUpdate.(timestamp).log: A summary log of the updates that were made to the knowledge base, with notes about some recommended manual reviews. This file should end with "Finished knowledge base migration". If this file contains any warnings, review them with your implementation team representative to see whether any corrective action needs to be taken.

ontologies/update/logs/knowledgeBaseUpdate.(timestamp).error.log: A log of errors that were encountered during the upgrade process. This file should be empty if the upgrade was successful.

ontologies/update/changedData/removedData.n3: An N3 file containing all the statements that were removed from the knowledge base.

ontologies/update/changedData/addedData.n3: An N3 file containing all the statements that were added to the knowledge base.
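
The two log checks described above can be automated. The following Python sketch is a minimal illustration: the function name is hypothetical, it assumes the most recently modified files belong to the upgrade just performed, and it treats the completion message as present if it appears on the last non-empty line of the summary log.

```python
from pathlib import Path


def check_update_logs(web_inf_dir):
    """Inspect the newest knowledge base update logs under WEB-INF.

    Returns a list of problems; an empty list means both checks passed.
    """
    log_dir = Path(web_inf_dir) / "ontologies" / "update" / "logs"
    all_logs = list(log_dir.glob("knowledgeBaseUpdate.*.log"))
    error_logs = [p for p in all_logs if p.name.endswith(".error.log")]
    summary_logs = [p for p in all_logs if not p.name.endswith(".error.log")]
    if not summary_logs:
        return [f"no knowledgeBaseUpdate log found in {log_dir}"]

    def newest(paths):
        # Pick the most recently modified file.
        return max(paths, key=lambda p: p.stat().st_mtime)

    problems = []
    summary = newest(summary_logs)
    lines = [ln for ln in summary.read_text(errors="replace").splitlines() if ln.strip()]
    if not lines or "Finished knowledge base migration" not in lines[-1]:
        problems.append(f"{summary.name} does not end with the completion message")
    if error_logs:
        errors = newest(error_logs)
        if errors.stat().st_size > 0:
            problems.append(f"{errors.name} is not empty")
    return problems
```

Warnings inside the summary log still need to be read by a person; this check only confirms that the migration reported completion and that no errors were logged.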

ii. Ontology knowledge base manual review

Changes to the VIVO core ontology may require corresponding modifications of the knowledge base instance data and local ontology extensions.

When Apache Tomcat starts up following the upgrade, it will initiate a process to examine the knowledge base and apply necessary changes. Not all of the modifications that may be required can be automated, so manual review of the knowledge base is recommended after the automated upgrade process. The automated process will make only the following types of changes:

Class or Property renaming: All references to the class (in the subject or object position) will be updated to the new name. References to the property will be updated to the new name.

Class or Property deletion: All type assertions of a deleted class will be removed.
All statements using a deleted property will be changed to use the nearest available superproperty. If there is no available superproperty then the statement will be deleted from the knowledge base. Note that all removed and added data is recorded in the files in the changedData directory.

Property addition: If a newly added property is the inverse of a previously existing property, the inverse of any statements using the pre-existing property will be asserted.

Annotation property default values: If a site has modified the value of a vitro annotation (such as displayRankAnnot or displayLimitAnnot) so that it is no longer using the default, then that setting will be left unchanged.
If a site is using the default value of a vitro annotation, and the default has been changed in the new version of the ontology, then the new default value will be propagated to the knowledge base.
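
The renaming and deletion rules for properties can be illustrated on a small set of statements. The Python sketch below is a simplified model for explanation only and is not the code VIVO runs: statements are plain (subject, property, object) tuples, and the function name and every URI used with it are hypothetical.

```python
def apply_property_changes(triples, renamed, deleted, superproperty):
    """Apply the property rename and deletion rules to a list of triples.

    renamed:       old property URI -> new property URI
    deleted:       set of deleted property URIs
    superproperty: deleted property URI -> nearest available superproperty URI
    Returns (result, dropped), where 'dropped' holds the statements that were
    deleted because no superproperty was available.
    """
    result, dropped = [], []
    for subject, prop, obj in triples:
        if prop in renamed:
            # Renamed property: keep the statement under the new name.
            result.append((subject, renamed[prop], obj))
        elif prop in deleted:
            if prop in superproperty:
                # Deleted property: fall back to the nearest superproperty.
                result.append((subject, superproperty[prop], obj))
            else:
                # No superproperty available: the statement is removed.
                dropped.append((subject, prop, obj))
        else:
            result.append((subject, prop, obj))
    return result, dropped
```

In this model, a statement using a deleted property survives only if a superproperty is available for it, which is why the removedData.n3 file is worth reviewing after the upgrade.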