Skip to content

VIVO 3828: Remove SDB in preparation to moving to new Jena.

backups requested to merge github/fork/TAMULib/3828-remove_sdb into jena4-upgrade

Created by: kaladay

VIVO GitHub issue: https://github.com/vivo-project/VIVO/issues/3828

What does this pull request do?

The Jena version being switched to (Jena 4) has removed support for SDB.

The current forms of DatasetWrapperFactory and StaticDatasetFactory should no longer be needed.

The SDB related code has been stripped out. Many of the classes with "SDB" in the name that actually provide more than just SDB have been re-created. These recreations are generally child classes of a similarly named Jena class. These recreates have "DB" in their name rather than "SDB".

The DatasetFactory.createMem() is deprecated and may not be available in Jena 4. Attempts to replace this with createTxnMem() have revealed problems that are potentially transaction related. Replacing with createGeneral() might be possible but this is not done to avoid introducing more potential problems.

Notable points in regards to replacing DatasetFactory.createMem():

  1. The method is deprecated.
  2. The DatasetFactory.createGeneral() is the compatible equivalent.
  3. The DatasetFactory.createGeneral() better supports TDB (which is the main reason for including in this commit set).
  4. The DatasetFactory.createTxnMem() is the more recommended alternative and is transactional.
  5. This better prepares the code for the upcoming Jena upgrade changes.

There are some serious existing problems with closing dataset connections (and related). The (now removed) DatasetWrapperFactory probably should be rewritten to provide a new dataset each time rather than copying an existing one. The close() functionality appears to not be being called for TDB due to the SDB conditions. With this SDB condition remove the connections would the become closed when closed() is called. This has exposed several problems that both tests and runtime expose. This problem is considered out of scope for the changes and the close code is commented out with a FIXME statement.

The documentation for example.runtime.properties refers to VitroConnection.DataSource.* only being used by SDB but this is incorrect. The OpenSocial code appears to talk directly to an SQL database using these properties. Update the documentation in this regard and replace the references to SDB with OpenSocial.

Remove no longer necessary SDB helper code that is added to TDB as well because it "shouldn't hurt".

Remove much of the documentation and installation functionality using or referencing SDB.

The IndividualDaoSDB.java file is implemting getAllIndividualUris() and getUpdatedSinceIterator() via the now removed IndivdiualSDB class. This functionality is used by RebuildIndexTask and UpdateUrisTask classes.

The new *DB classes now have java docs. These java docs have a bare-bones level of detail.

How should this be tested?

Load the i18n data and try to use via the web interface.

Additional Notes:

The DatasetWrapperFactory that is removed in this PR is a factory of DatasetWrapper as described. However, the DatasetWrapper being passed has a local reference of the Dataset. The same Dataset, as far as I can tell, is being passed around. Creating a new DatasetWrapper for the same Dataset does not seem helpful when if that same Dataset is closed anywhere, all of the different DatasetWrapper instantiations holding a reference to that same Dataset will in actuality could become closed even though their respective wrapper does not necessirly have the closed boolean set.

This behavior is a bit confusing to me so I removed the implementation as I described above under the "What does this pull request do?" section. If this is doing something I am not seeing or undersanding, then the DatasetWrapperFactory could be restored. I think it would be worth while to consider creating/instantiating a new Dataset when the a new DatasetWrapper is created via the DatasetWrapperFactory. This would possible be of help in resolving the closure problems described above.

The StaticDatasetFactory is confusing to me. This class appears to provide a non-static instance of a Dataset being shared globally and is treated as if it were static. I am uncertain on the implications of this design decision.

I described above about the deprecation of DatasetFactory.createMem();. Further investigation has revealed that DatasetFactory.createMem(); is in fact removed as of Jena 4.0.0 and greater. The migration of that method call will be performed in the subsequent Jena migration related PRs.

Interested parties

@brianjlowe @vivo-project/vivo-committers

Merge request reports