VIVO Release 1 V1.2 Installation Guide

January 28, 2011
Missing pieces and fixes

Release announcement for V1.2

Text from the wiki page

Installation process for V1.2

This document summarizes the VIVO installation process. This and other documentation can be found on the support page at VIVOweb.org.

VIVO Developers: If you are working on the VIVO source code from Subversion, the instructions are slightly different. Please consult developers.txt in this directory.

Steps to Installation

  1. Install required software
  2. Create an empty MySQL database
  3. Download the VIVO Application Source
  4. Specify deployment properties
  5. Compile and deploy
  6. Set Tomcat JVM parameters and security limits
  7. Start Tomcat
  8. Log in and add RDF data
  9. Set the Contact Email Address (if using "Contact Us" form)
  10. Set up Apache Tomcat Connector
  11. Configure Pellet Reasoner
  12. Using an External Authentication System with VIVO
  13. Was the installation successful?

I. Install required software

Before installing VIVO, make sure that the following software is installed on the desired machine:

Be sure to set up the JAVA_HOME and ANT_HOME environment variables, and add the executables to your path, following the installation directions for your operating system on each software support web site.
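As a quick pre-flight check, the sketch below reports which tools and environment variables are visible to the current shell. The command names java, ant, and mysql are assumptions drawn from the later steps of this guide, and JAVA_HOME is assumed because Ant and Tomcat conventionally rely on it. Adjust both lists to match your installation.

```shell
#!/bin/sh
# Report whether a command is on the PATH; returns non-zero if it is missing.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found:   $1"
    else
        echo "MISSING: $1"
        return 1
    fi
}

# Report whether an environment variable points at an existing directory.
# $1 = variable name, $2 = its current value
check_home() {
    if [ -n "$2" ] && [ -d "$2" ]; then
        echo "ok:      $1=$2"
    else
        echo "NOT SET: $1 (or not a directory)"
        return 1
    fi
}

missing=0
# Assumed tool names; edit to match the software you installed.
for tool in java ant mysql; do
    check_tool "$tool" || missing=1
done
check_home JAVA_HOME "${JAVA_HOME:-}" || missing=1
check_home ANT_HOME "${ANT_HOME:-}" || missing=1

if [ "$missing" -eq 0 ]; then
    echo "All checks passed."
else
    echo "Some prerequisites need attention."
fi
```

The script only inspects the environment of the shell that runs it. Tomcat may run as a different user with a different environment.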

II. Create an empty MySQL database

Decide on a database name, username, and password. Log in to your MySQL server and create a new database that uses UTF-8 encoding. You will need these values in Step IV, when you configure the deployment properties. At the MySQL command line, you can create the database and user with the commands below, substituting your own values for dbname, username, hostname, and password. In most cases, the hostname will be localhost.

                CREATE DATABASE dbname CHARACTER SET utf8;
            

Grant access to a database user. For example:

                GRANT ALL ON dbname.* TO 'username'@'hostname' IDENTIFIED BY 'password';
            

Keep track of the database name, username, and password for Step IV.
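If you prefer to script this step, the following sketch prints the two statements above with your values filled in, so that they can be reviewed and then piped to the mysql client. The function name make_vivo_sql and the placeholder values are hypothetical.

```shell
#!/bin/sh
# Placeholder values (assumptions for illustration); substitute your own.
DB_NAME="vivo"
DB_USER="vivouser"
DB_HOST="localhost"
DB_PASS="changeme"

# Print the CREATE DATABASE and GRANT statements from this step.
# $1 = database name, $2 = username, $3 = hostname, $4 = password
make_vivo_sql() {
    cat <<EOF
CREATE DATABASE $1 CHARACTER SET utf8;
GRANT ALL ON $1.* TO '$2'@'$3' IDENTIFIED BY '$4';
EOF
}

# Review the generated statements on screen first.
make_vivo_sql "$DB_NAME" "$DB_USER" "$DB_HOST" "$DB_PASS"

# To execute them as a MySQL administrator, pipe the output to the client:
#   make_vivo_sql "$DB_NAME" "$DB_USER" "$DB_HOST" "$DB_PASS" | mysql -u root -p
```

The values are not escaped, so avoid quote characters in the password when using this sketch.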

III. Download the VIVO Application Source

Download the VIVO application source as either a rel-1.2.zip or a rel-1.2.gz file and unpack it on your web server:
http://vivoweb.org/download

IV. Specify deployment properties

At the top level of the unpacked distribution, copy the file example.deploy.properties to a file named simply deploy.properties. This file sets several properties used in compilation and deployment.

Windows: If you are installing on a Windows operating system, include the Windows drive and use the forward slash "/" rather than the backslash "\" in directory locations, e.g. c:/tomcat.

External authentication: If you want to use an external authentication system like Shibboleth or CUWebAuth, you will need to set two additional properties in this file. See the section below entitled Using an External Authentication System with VIVO.

Property Name Example Value
Default namespace: VIVO installations make their RDF resources available for harvest using linked data. Requests for RDF resource URIs redirect to HTML or RDF representations as specified by the client. To make this possible, VIVO's default namespace must have a certain structure and begin with the public web address of the VIVO installation. For example, if the web address of a VIVO installation is "http://vivo.example.edu/", the default namespace must be set to "http://vivo.example.edu/individual/" in order to support linked data. Similarly, if VIVO is installed at "http://www.example.edu/vivo", the default namespace must be set to "http://www.example.edu/vivo/individual/".
* The namespace must end with "individual/" (including the trailing slash).
Vitro.defaultNamespace http://vivo.mydomain.edu/individual/
Directory where Vitro code is located. In most deployments, this is set to ./vitro-core, but it commonly points elsewhere during development.
vitro.core.dir ./vitro-core
Directory where tomcat is installed.
tomcat.home /usr/local/tomcat
Name of your VIVO application.
webapp.name vivo
Directory where uploaded files will be stored. You must create this directory ahead of time.
upload.directory /usr/local/vivo/data/uploads
Directory where the Lucene search index will be built. Depending on your permissions and who Tomcat is running as, you may need to create this directory ahead of time.
LuceneSetup.indexDir /usr/local/vivo/data/luceneIndex
Specify an SMTP host that the form will use for sending e-mail (Optional). If this is left blank, the contact form will be hidden and disabled.
Vitro.smtpHost smtp.servername.edu
Specify the JDBC URL of your database. Change the end of the URL to reflect your database name (if it is not "vivo").
VitroConnection.DataSource.url jdbc:mysql://localhost/vivo
Change the username to match the authorized user you created in MySQL.
VitroConnection.DataSource.username username
Change the password to match the password you created in MySQL.
VitroConnection.DataSource.password password
Specify the Jena triple store technology to use. SDB is Jena's SPARQL database; this setting allows RDF data to scale beyond the limits of the JVM heap. Set to RDB to use the older Jena RDB store with in-memory caching.
VitroConnection.DataSource.tripleStoreType SDB
Specify the maximum number of active connections in the database connection pool to support the anticipated number of concurrent page requests. It is not necessary to adjust this value when using the RDB configuration.
VitroConnection.DataSource.pool.maxActive 40
Specify the maximum number of database connections that will be allowed to remain idle in the connection pool. Default is 25% of the maximum number of active connections.
VitroConnection.DataSource.pool.maxIdle 10
Change the dbtype setting to use a database other than MySQL. Otherwise, leave this value unchanged. Possible values are DB2, derby, HSQLDB, H2, MySQL, Oracle, PostgreSQL, and SQLServer. Refer to http://openjena.org/wiki/SDB/Databases_Supported for additional information.
VitroConnection.DataSource.dbtype MySQL
Specify a driver class name to use a database other than MySQL. Otherwise, leave this value unchanged. The JAR file for this driver must be added to the webapp/lib directory within the vitro.core.dir specified above.
VitroConnection.DataSource.driver com.mysql.jdbc.Driver
Change the validation query used to test database connections only if necessary to use a database other than MySQL. Otherwise, leave this value unchanged.
VitroConnection.DataSource.validationQuery SELECT 1
Specify the name of your first admin user for the VIVO application. This user will have an initial temporary password of 'defaultAdmin'. You will be prompted to create a new password on first login.
initialAdminUser defaultAdmin
The name of a property that can be used to associate an Individual with a user account. When a user logs in with a name that matches the value of this property, the user will be authorized to edit that Individual.
selfEditing.idMatchingProperty http://vivo.mydomain.edu/ns#networkId
The Temporal Graph visualization compares organizations, or people within an organization, on parameters such as number of publications or grants. This parameter is used as the default when a URI is not provided, and whenever the visualization is rendered for the top-level organization. If this parameter is absent, a SPARQL query is run that attempts to find a top-level organization.
visualization.topLevelOrg http://vivo-trunk.indiana.edu/individual/topLevelOrgURI
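Because a mistyped default namespace silently breaks linked data, it can be worth checking the value before building. The sketch below tests the rule described for Vitro.defaultNamespace: the public web address followed by "individual/". The function name check_namespace is hypothetical.

```shell
#!/bin/sh
# Succeed only if the namespace is the site URL followed by "individual/".
# $1 = public web address of the VIVO installation
# $2 = value intended for Vitro.defaultNamespace
check_namespace() {
    site_url=$1
    namespace=$2
    # Add a trailing slash to the site URL if it lacks one.
    case $site_url in
        */) ;;
        *)  site_url="$site_url/" ;;
    esac
    [ "$namespace" = "${site_url}individual/" ]
}

if check_namespace "http://www.example.edu/vivo" \
                   "http://www.example.edu/vivo/individual/"; then
    echo "namespace is consistent with the site address"
else
    echo "namespace does not match the site address"
fi
```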

V. Compile and deploy

At the command line, from the top level of the unpacked distribution directory, type:

                ant all
            

to build VIVO and deploy to Tomcat's webapps directory.
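A small wrapper can guard against running the build before Step IV is complete. This is a sketch only; the function name build_vivo and the example path are hypothetical.

```shell
#!/bin/sh
# Run "ant all" from the distribution directory, but only if
# deploy.properties has already been created there (Step IV).
# $1 = top level of the unpacked distribution
build_vivo() {
    dist_dir=$1
    if [ ! -f "$dist_dir/deploy.properties" ]; then
        echo "deploy.properties not found in $dist_dir; complete Step IV first" >&2
        return 1
    fi
    # Use a subshell so the caller's working directory is unchanged.
    (cd "$dist_dir" && ant all)
}

# Example (hypothetical path):
#   build_vivo /usr/local/src/vivo
```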

VI. Set Tomcat JVM parameters and security limits

Currently, VIVO copies the contents of your RDF database into memory in order to serve Web requests quickly (the in-memory copy and the underlying database are kept in sync as edits are performed).

VIVO will require more memory than that allocated to Tomcat by default. With most installations of Tomcat, the "setenv.sh" or "setenv.bat" file in Tomcat's bin directory is a convenient place to set the memory parameters.
For example:

                    export CATALINA_OPTS="-Xms1024m -Xmx2048m -XX:MaxPermSize=128m"
                

This sets Tomcat to allocate an initial heap of 1024 megabytes, a maximum heap of 2048 megabytes, and a PermGen space of 128 megabytes. 1024 megabytes is a minimum practical heap size for production installations storing data for large academic institutions, and additional heap space is preferable. For testing with small sets of data, 256m to 512m should be sufficient.
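The JVM refuses to start when the initial heap (-Xms) is larger than the maximum heap (-Xmx), so a quick sanity check on the options string can save a failed restart. The sketch below handles only values written in megabytes, such as 512m. The function names heap_mb and heap_opts_ok are hypothetical.

```shell
#!/bin/sh
# Extract the numeric value of a heap flag given in megabytes.
# $1 = flag name (Xms or Xmx), $2 = the options string
heap_mb() {
    printf '%s\n' "$2" | sed -n "s/.*-$1\([0-9][0-9]*\)m.*/\1/p"
}

# Succeed if both flags are present and the initial heap
# does not exceed the maximum heap.
heap_opts_ok() {
    xms=$(heap_mb Xms "$1")
    xmx=$(heap_mb Xmx "$1")
    [ -n "$xms" ] && [ -n "$xmx" ] && [ "$xms" -le "$xmx" ]
}

if heap_opts_ok "${CATALINA_OPTS:-}"; then
    echo "heap settings look consistent"
else
    echo "check -Xms and -Xmx in CATALINA_OPTS"
fi
```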

If an OutOfMemoryError is encountered during VIVO execution, it can be remedied by increasing the heap parameters and restarting Tomcat.

Security limits: VIVO is a multithreaded web application that may require more threads than are permitted under your Linux installation's default configuration. Ensure that your installation can support the required number of threads by making the following edits to /etc/security/limits.conf:

                    apache	hard	nproc	400
                    tomcat6	hard	nproc	1500 
                

VII. Start Tomcat

Most Tomcat installations can be started by running startup.sh or startup.bat in Tomcat's bin directory. Point your browser to "http://localhost:8080/vivo/" to test the application. If Tomcat does not start up, or the VIVO application is not visible, check the catalina.out file in Tomcat's logs directory.
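The same check can be made from the command line, which is convenient on a server without a browser. The sketch below assumes that curl is installed and that the application is deployed under the default address shown above. The function names http_status and vivo_is_up are hypothetical.

```shell
#!/bin/sh
# Print the final HTTP status code for a URL, following redirects.
# curl prints "000" when no connection could be made.
http_status() {
    curl -s -L -o /dev/null -w '%{http_code}' --max-time 10 "$1"
}

# Succeed only if the URL answers with HTTP 200.
vivo_is_up() {
    [ "$(http_status "$1")" = "200" ]
}

if vivo_is_up "http://localhost:8080/vivo/"; then
    echo "VIVO is responding"
else
    echo "No response; check catalina.out in Tomcat's logs directory"
fi
```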

VIII. Log in and add RDF data

If the startup was successful, you will see a welcome message informing you that you have successfully installed VIVO. Click the "Log in" link near the upper right corner. Log in with the initialAdminUser username you set up in Step IV. The initial password for the initialAdminUser account is "defaultAdmin" (without the quotes). On first login, you will be prompted to select a new password and verify it a second time.

After verifying your new password, you will be presented with a menu of editing options. Here you can create OWL classes, object properties, and data properties, and configure the display of data. Currently, any classes you wish to make visible on your website must be part of a class group, and there are a number of visibility and display options available for each ontology entity. VIVO comes with a core VIVO ontology, but you may also upload other ontologies from an RDF file.

Under "Advanced Data Tools," click "Add/Remove RDF Data." Note that Vitro currently works best with OWL-DL ontologies and has only limited support for pure RDF data. You can enter a URL pointing to the RDF data you wish to load, or upload a file from your local machine. Ensure that the "add RDF" radio button is selected. You will also likely want to check "create classgroups automatically."

Clicking the "Index" tab in the navigation bar at the top left of the page will show a simple index of the knowledge base.

See more documentation for configuring VIVO, ingesting data, and manually adding data at http://vivoweb.org/support.

IX. Set the Contact Email Address (if using "Contact Us" form)

If you have configured your application to use the "Contact Us" feature in Step IV (Vitro.smtpHost), you will also need to add an email address to the VIVO application. This is the address to which the contact form submits. It can be a list server or an individual's email address.

Log in as a system administrator. Navigate to the "Site Admin" table of contents (link in the right side of the header). Go to "Site Information" (under "Site Configuration"). In the "Site Information Editing Form," enter a functional email address in the "Contact Email Address" field and submit the change.

If you set Vitro.smtpHost in Step IV and do NOT provide an email address in this step, your users will receive a Java error in the interface.

X. Set up Apache Tomcat Connector

It is recommended that a Tomcat Connector such as mod_jk be used to ensure that the site address does not include the port number (e.g. 8080) and an additional reference to the Tomcat context name (e.g. /vivo).

This will make VIVO available at "http://example.com" instead of "http://example.com:8080/vivo"

Using the mod_jk connector allows for communication between Tomcat and the primary web server. The Quick Start HowTo on the Apache site describes the minimum server configurations for several popular web servers.

After setting up the mod_jk connector above, you will need to modify Tomcat's server.xml (located in [tomcat root]/conf/) to respond to requests from Apache via the connector. Look for the <Connector> element and add the following attributes:

                connectionTimeout="20000" maxThreads="320" keepAliveTimeout="20000"  
            

Note: the value for maxThreads (320) should equal the value of MaxClients in Apache's httpd.conf file.

Locate the <Host name="localhost"...> directive and update as follows:

	    <Host name="localhost" appBase="webapps"
	        deployOnStartup="false"
	        unpackWARs="true" autoDeploy="false"
	        xmlValidation="false" xmlNamespaceAware="false">
	
		<Alias>example.com</Alias>
		<Context path=""
			docBase="/usr/local/tomcat/webapps/vivo"
			reloadable="true"
			cookies="true" >
			<Manager pathname="" />
			<Environment type="java.lang.String" override="false" 
				name="path.configuration" 
				value="deploy.properties"
			/>
		</Context>
		...
			

XI. Configure Pellet Reasoner

VIVO uses the Pellet engine to perform reasoning, which runs in the background at startup and also when the knowledge base is edited. VIVO continues serving pages while the reasoner continues working; when the reasoner finishes, the new inferences appear. Inferred statements are cached in a database graph so that they are available immediately when VIVO is restarted.

By default, Pellet is fed only an incomplete view of your ontology and only certain inferences are materialized. These include rdf:type, rdfs:subClassOf, owl:equivalentClass, and owl:disjointWith. This mode is typically suitable for ontologies with a lot of instance data. If you would like to keep the default mode, skip to the next step.

To enable "complete" OWL inference (materialize all significant entailed statements), open "vitro-core/webapp/config/web.xml" and search for PelletReasonerSetup.

Then change the name of the listener class to PelletReasonerSetupComplete. Because "complete" reasoning can be very resource intensive, there is also an option to materialize nearly all inferences except owl:sameAs and owl:differentFrom.

This is enabled by specifying PelletReasonerSetupPseudocomplete. For ontologies with large numbers of individuals, this mode can offer enormous performance improvements over the "complete" mode.

Finally, a class called PelletReasonerSetupPseudocompleteIgnoreDataproperties is provided to improve performance on ontologies with large literals where data property entailments are not needed.
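Before editing web.xml, it can help to see which Pellet listener is currently configured. The sketch below simply searches the file for the class name prefix. The function name find_pellet_listener is hypothetical.

```shell
#!/bin/sh
# List the lines of a web.xml file that mention a Pellet listener class,
# prefixed with their line numbers.
# $1 = path to web.xml
find_pellet_listener() {
    grep -n 'PelletReasonerSetup' "$1"
}

# Example, run from the top level of the unpacked distribution:
#   find_pellet_listener vitro-core/webapp/config/web.xml
```

Because all four class names begin with PelletReasonerSetup, the same search shows whichever mode is currently active.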

XII. Using an External Authentication System with VIVO

VIVO can be configured to work with an external authentication system like Shibboleth or CUWebAuth.

VIVO must be accessible only through an Apache HTTP server. The Apache server will be configured to invoke the external authentication system. When the user completes the authentication, the Apache server will pass a network ID to VIVO, to identify the user.

If VIVO has an account for that user, the user will be logged in with the privileges of that account. In the absence of an account, VIVO will try to find a page associated with the user. If such a page is found, the user can log in to edit their own profile information.

Configuring the Apache server

Your institution will provide you with instructions for setting up the external authentication system. The Apache server must be configured to secure a page in VIVO. When a user reaches this secured page, the Apache server will invoke the external authentication system.

For VIVO, this secured page is named: /loginExternalAuthReturn

When your instructions call for the location of the secured page, this is the value you should use.

Configuring VIVO

To enable external authentication, VIVO requires three values in the deploy.properties file.

XIII. Was the installation successful?

If you have completed the previous steps, you have good indications that the installation was successful.

Here is a simple test to see whether the ontology files were loaded:

Here is a test to see whether your system is configured to serve linked data:
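One possible command-line check, offered as a sketch rather than the guide's official test, is to request an individual's URI while asking for RDF/XML and see what content type comes back after redirects. It assumes curl is installed. The function name rdf_content_type and the example URI are hypothetical; substitute the URI of an individual that exists in your installation.

```shell
#!/bin/sh
# Request a URI with an RDF/XML Accept header, follow redirects, and
# print the content type of the final response (empty if the request fails).
rdf_content_type() {
    curl -s -L -o /dev/null -w '%{content_type}' --max-time 10 \
        -H 'Accept: application/rdf+xml' "$1"
}

# Example (hypothetical individual URI):
#   rdf_content_type "http://vivo.example.edu/individual/n1234"
```

A content type that mentions RDF suggests that the request was redirected to an RDF representation. An HTML content type suggests the default namespace or the connector configuration needs another look.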

Finally, test the search index.