Research Data Repository Services Delivered in Stage One

Over the last year, the Research Data Repository has delivered an impressive array of infrastructure and services.

Services that exist now

Service and how the service is delivered
  • A researcher can request a Research Shared Drive of up to 1 TB, with multiple users and access from anywhere on the UWS campus. An FAQ is online.
  • The request can originate from the researcher or from eResearch, and then the ITS team provision the share in accordance with the Support plan.
  • The request form is online.
  • A researcher can back up their git repository onto the Research Data Store.
  • The service is delivered ad hoc by the eResearch team.
  • A researcher can request a virtual machine.
  • The request can originate from the researcher or from eResearch, and then the ITS team provision the virtual machine in accordance with the relevant SOP.
  • A researcher can deposit their research data in the Research Data Catalogue.
  • There are two ways to initiate the request. The first is self-service, using an online form.
  • The second is through discussions with eResearch and the Library Research Services team.
  • Once initiated, the Research Services Coordinator – Library follows Library procedure in creating a new collection record and storing the data collection (as applicable).
  • Library systems can regularly harvest metadata from UWS and web sources of truth.
  • This metadata is stored in the Research Data Catalogue and provides lookup for applications like ReDBox and HIEv.
  • The service is delivered in accordance with Library procedures.
  • This is self-service by obtaining a copy of the checklist online, with support from the eResearch team as needed.
  • This is self-service by obtaining a copy of the checklist online, with support from the eResearch team as needed. The eResearch team can and do occasionally write Data Management Plans on behalf of the researchers, using the same template.
  • This is self-service: researchers read the website content and follow links for more information, with assistance from the eResearch team.

External services that we are supporting

Service and how the service is delivered
  • A researcher can obtain a NeCTAR virtual machine of up to 2 cores at a time for up to 3 months (eResearch can assist with access and setup).
  • A researcher can apply for medium and large (high intensity) virtual machines from NeCTAR.
  • A researcher can get a Cloudstor+ account through AARNET; this is cloud storage for research, located within Australia (eResearch are actively promoting this service and seeking user evaluations of it).
  • A ReDBox administrator can raise a bug or issue with QCIF for resolution.
  • QCIF provide support, with assistance from the eResearch team.

What infrastructure has been delivered

Infrastructure – Storage
  • 127 TB of high quality disk for researchers and research-related uses has been deployed. This storage is highly flexible and extensible and can be utilised as SAN or NAS depending on the need.
      > Migration of all data from the old 70 TB SAN
  • Established a new service, Research Shared Drive (SIF share).
      > New FAQ/README with instructions for install, and also best practices in data management
      > New support plan through close coordination with eResearch and ITS
      > 10 research teams are currently using the RDS.
  • Storage has been connected to a number of virtual machines for research specific projects and applications.
Collaborative Storage
  • Explored and trialled several collaborative storage solutions, including Oxygen Cloud, WOS cloud, SparkleShare, and OwnCloud.
  • Selected OwnCloud based on experience at other organisations (such as AARNET and Lincoln University in the UK).
  • A trial was conducted in which a link was made between Dropbox and the Research Shared Drive. The team set up a Dropbox account which can receive a copy of a researcher’s Dropbox and store that data on the same researcher’s Shared Drive. This system is still in development.
  • A trial was conducted in which a link was made between source code repositories (version control systems) and the Research Data Store. The link is demonstrated by a UWS git server which clones publicly accessible git repositories. By way of example, we cloned the eResearch-apps repository.
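The git link described above could be approximated with a mirror clone that is refreshed on a schedule. The following is a minimal sketch, not the UWS implementation: the function names, the storage layout, and the choice of `git clone --mirror` followed by `git remote update --prune` are all assumptions made for illustration.

```python
import subprocess
from pathlib import Path


def mirror_path(repo_url: str, store_root: Path) -> Path:
    """Return the directory under store_root that holds the mirror."""
    name = repo_url.rstrip("/").rsplit("/", 1)[-1]
    if not name.endswith(".git"):
        name += ".git"
    return store_root / name


def mirror_command(repo_url: str, store_root: Path) -> list[str]:
    """Build the git command that creates or refreshes the mirror."""
    target = mirror_path(repo_url, store_root)
    if target.exists():
        # Mirror already present: fetch new commits and drop deleted refs.
        return ["git", "-C", str(target), "remote", "update", "--prune"]
    # First run: a bare mirror clone copies every branch and tag.
    return ["git", "clone", "--mirror", repo_url, str(target)]


def backup(repo_url: str, store_root: Path) -> None:
    """Run the mirror command; raises CalledProcessError if git fails."""
    store_root.mkdir(parents=True, exist_ok=True)
    subprocess.run(mirror_command(repo_url, store_root), check=True)
```

For example, `backup("https://example.org/team/eResearch-apps.git", Path("/mnt/rds/git-mirrors"))` would clone on the first run and fetch on later runs. Both the URL and the mount point are placeholders.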
Up Next:
  • Trial a collaborative storage option based on OwnCloud.
  • Establish a mechanism by which a user pushes their git repository to UWS storage.
  • Serve the needs of researchers who use other version control systems such as Mercurial and Subversion.
Infrastructure – Compute
  • 4 servers have been provisioned for research use: 2 existing servers from HIE and 2 provided through RDR. Together these form the Research Cluster.
  • The Research Cluster comprises 160 processor cores and 1024 GB of memory.
  • 6 virtual machines that had been created previously were successfully migrated onto the Research Cluster.
  • 9 virtual machines have been created in the Research Cluster, with plans to migrate more virtual machines across from the School of Medicine and other schools and institutes.
  • We can provision up to approximately 40 ‘medium intensity’ virtual machines.
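As a rough check on the figures above, the cluster totals can be divided by the number of virtual machines. This is a sketch only: the report does not define 'medium intensity', so the even split with no overcommitment shown here is an assumption, and the function name is hypothetical.

```python
CLUSTER_CORES = 160       # total processor cores, from the figures above
CLUSTER_MEMORY_GB = 1024  # total memory in GB, from the figures above


def per_vm_share(vm_count: int) -> tuple[float, float]:
    """Cores and memory (GB) per VM if the cluster is divided evenly."""
    if vm_count <= 0:
        raise ValueError("vm_count must be positive")
    return CLUSTER_CORES / vm_count, CLUSTER_MEMORY_GB / vm_count


cores, memory_gb = per_vm_share(40)
print(f"{cores:g} cores and {memory_gb:g} GB per VM")
```

Under that assumption, 40 virtual machines would each receive 4 cores and 25.6 GB of memory.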
Up Next:
  • Create canned virtual machines which come ready-made with the tools needed to analyse data.
Infrastructure – Software
  • New packaging software was developed for research data, called CrateIt (Cr8it). Cr8it was started under two different approaches. The first approach was to leverage a toolset called The Fascinator, and the other approach was to incorporate new features into OwnCloud.
  • Document conversion, such as ePub generation, was ported into OwnCloud-Cr8it.
  • Work started on automatically generating a combined metadata catalogue record plus manifest. The manifest will be human and machine readable, leveraging work done by the HIEv (DC21) project.
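To illustrate what a manifest that is both human and machine readable might look like, the sketch below lists each file in a package with its size and checksum and writes the result as indented JSON. The report does not specify the Cr8it or HIEv manifest format, so the field names, the file name `manifest.json`, and both functions are hypothetical.

```python
import hashlib
import json
from pathlib import Path


def build_manifest(package_dir: Path, skip: str = "manifest.json") -> dict:
    """List every file in the package with its size and SHA-256 checksum."""
    files = []
    for path in sorted(p for p in package_dir.rglob("*") if p.is_file()):
        if path.name == skip:
            continue  # do not list a manifest left over from an earlier run
        files.append({
            "path": path.relative_to(package_dir).as_posix(),
            "bytes": path.stat().st_size,
            "sha256": hashlib.sha256(path.read_bytes()).hexdigest(),
        })
    return {"package": package_dir.name, "files": files}


def write_manifest(package_dir: Path, name: str = "manifest.json") -> Path:
    """Write the manifest as indented JSON so people can read it too."""
    manifest = build_manifest(package_dir, skip=name)
    out = package_dir / name
    out.write_text(json.dumps(manifest, indent=2), encoding="utf-8")
    return out
```

JSON is used here only because it is readable by both people and programs; any structured text format would serve the same purpose.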
Up Next:
  • Create a Cr8it trial and roll it out.
  • Flesh out what metadata record needs to be created by the Cr8it packaging process.
Research Data Catalogue
  • A simple form was developed that a researcher can use to indicate that they have a data set they would like to archive.
  • A pro forma questionnaire has been developed by the Research Services team at the Library. A process for including a new data set was also developed by the Library Research team.
  • 3 new procedure documents were created which formalised the ingest of metadata from RHESYS (University Research Management System) and from external sources, such as ReDBox wiki, NHMRC and ARC. Approximately 1,500 researchers and 500 projects are in the Research Data Catalogue and are available via lookup when a new data collection record is created.
  • New Research Data Catalogue entries (30+) were added to Research Data Australia, searchable by anyone with web access.
  • The ReDBox application was set up so that the unique details of people who create data sets at UWS are merged with an existing (or newly created) record in the National Library of Australia database. That record is linked to any other data sets or publications which they have created in the same field or under the same name.
  • A new feature in ReDBox was added whereby an administrator can view the results of ingesting records about people and research projects. These results are presented in the form of ingest reports, describing what was ingested, modified, or removed, to support Quality Assurance going forward.
  • A ReDBox support agreement was negotiated with QCIF, which provides bug fixes and technical support until December 2014.
  • A new wizard for creating a data management plan inside the data catalogue is currently being trialled. The idea is that any data management plan which is created will be stored in the catalogue along with the data, and can be exported as a PDF if needed.
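The ingest reports mentioned above classify records as ingested, modified, or removed. One way to derive such a report is to compare snapshots of the records taken before and after an ingest. The sketch below assumes records are held as dictionaries keyed by a record ID; it is not how ReDBox implements the feature, and the function name and sample data are hypothetical.

```python
def ingest_report(
    before: dict[str, dict], after: dict[str, dict]
) -> dict[str, list[str]]:
    """Compare two snapshots keyed by record ID and classify the changes."""
    return {
        # IDs present only after the ingest are new records.
        "ingested": sorted(after.keys() - before.keys()),
        # IDs present only before the ingest have been removed.
        "removed": sorted(before.keys() - after.keys()),
        # IDs present in both snapshots whose content differs were modified.
        "modified": sorted(
            key for key in before.keys() & after.keys()
            if before[key] != after[key]
        ),
    }


# Hypothetical sample data for illustration only.
before = {"p1": {"name": "A. Smith"}, "p2": {"name": "B. Jones"}}
after = {"p1": {"name": "Alex Smith"}, "p3": {"name": "C. Lee"}}
report = ingest_report(before, after)
```

With the sample data, `p3` is reported as ingested, `p1` as modified, and `p2` as removed.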
Services – Research Data Management
  • A new Data Management Plan Checklist was created.
  • A new Data Management Plan Template was created.
  • An additional page was added to the Office of Research Services pages, which included:
      > Data Management defined
      > Data Management best practices
      > Links to RDR services
      > Links to external services and more information as applicable
      > Standard pro forma language that researchers can use to complete their research application forms
  • Internal application forms were improved to ask researchers to explain how data management will be addressed, including:
      > Internal grant application for UWS funded research
      > Application form to start new external grant application through ORS
  • eResearch interviewed researchers with live projects and created 3 Data Management Plans using the Data Management Plan Template. These plans have been provided to the researchers.
  • eResearch interviewed managers of research facilities and drafted 4 Data Management Plans thus far, which have been provided to the facility managers.
Up Next:
  • Finalise Data Management Plans for our research facilities. In addition, eResearch is currently assisting with new shared drives for these facilities (this is really BAU but is within the scope of the project).
  • Deposit the Data Management Plans in the Research Data Catalogue.