The first challenge of the OhaTV pilot was harmonizing the concepts or, in practice, creating a common language. Next, the procedures for storing information on teaching and degrees had to be harmonized, which turned out to be a major challenge. The information needed was extracted from the universities’ own information systems, harmonized, and transferred into the shared data warehouse. The University of Jyväskylä, the University of Oulu, and the Swedish School of Economics and Business Administration were the piloting institutions, with Jyväskylä the first to be connected to the system. Connecting the other universities was easier, and the work performed so far will reduce the workload later.
Data in a warehouse – money in the bank
Information is a valuable resource in the administration of higher education institutions. Hence, reliable information concerning administration should be easily available both for the institutions themselves and for the authorities for statistical purposes. Data warehousing for academic administration is an answer to this need. It serves to organize student, finance, and personnel administration information.
CSC – IT Center for Science, Ltd. coordinates the data warehousing project, being responsible for the practical arrangements led by a steering group.
“We offer a centralized, common service for 50 universities and polytechnics in Finland. The estimated cost of building a data warehouse for one higher education institution is EUR 1.2 million, so if each one were to create its own warehouse, the total cost would be EUR 60 million”, says Application Specialist Teemu Kemppainen from CSC.
“By creating a shared data warehouse, solutions made once can easily be copied. This can reduce the costs per institution to less than a tenth. At the same time, the information will all be in the same form and suitable for comparison due to the common concept model. A centralized data warehousing service is the logical solution”, says Kemppainen.
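The arithmetic behind these figures can be checked with a short sketch. The only inputs taken from the article are the 50 institutions, the EUR 1.2 million estimate, and the "less than a tenth" claim; the article gives no actual cost for the shared service, so the last two values are upper bounds implied by that claim, not reported figures. All variable names are illustrative.

```python
# Figures quoted in the article
institutions = 50
cost_per_own_warehouse_meur = 1.2  # EUR million, estimate for one institution

# Total if every institution built its own warehouse
total_separate_meur = institutions * cost_per_own_warehouse_meur

# "Less than a tenth" per institution only gives an upper bound for the
# shared approach; the real cost is not stated in the article.
max_shared_cost_per_institution_meur = cost_per_own_warehouse_meur / 10
max_shared_total_meur = institutions * max_shared_cost_per_institution_meur

print(f"Separate warehouses: EUR {total_separate_meur:.0f} million in total")
print(f"Shared warehouse: under EUR {max_shared_total_meur:.0f} million in total")
```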
Photo: Application Specialist Teemu Kemppainen, CSC. © Jyrki Hokkanen.
Three hundred concepts agreed
The reporting requirements of the Ministry of Education, Statistics Finland, and Kela, the Social Insurance Institution of Finland, together with the institutions’ own management information needs, mean that information has to be collected from several information systems and sources. Previously this was done manually.
There are numerous questions to be answered: What is the cost of a master’s degree? Does the cost vary between institutions? Which study modules consume the most resources? Which ones are the most popular – and which ones have the highest numbers of drop-outs? Furthermore, the same information often exists in several different versions. Against this background, the initiative to build the data warehouse came from the higher education institutions themselves.
The work began during the OhaTV project with concept definition, which produced a kind of building plan for the data warehouse. The concepts were harmonized between the different higher education institutions, and the interrelationships between the concepts were defined.
Concept definition work was carried out in broad-based working groups, including specialists from the higher education institutions’ finance, personnel and administration functions. To provide external input, representatives from Statistics Finland participated in the work.
The working groups created a model of 300 concepts and a draft glossary last spring. After a round of comments at the institutions, hundreds of revisions were made to the concept model during the summer. From now on, the glossaries will be maintained continuously, and definitions that are not detailed enough will be refined as part of routine maintenance. The goal is that equivalent functions are always described with the same concepts.
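One way to picture what "the same concepts" means in practice is a glossary that maps each institution's local field names to shared concept names. The sketch below is only an illustration of that idea: the institutions `univ_a` and `univ_b`, the local field names, and the concept names `student_id` and `credits` are all hypothetical and are not taken from the actual concept model.

```python
# Hypothetical glossary: local field name -> shared concept name.
GLOSSARY = {
    "univ_a": {"opiskelija_nro": "student_id", "op": "credits"},
    "univ_b": {"stud_no": "student_id", "ects": "credits"},
}


def harmonize(institution, record):
    """Rename a source record's fields to the shared concept names."""
    mapping = GLOSSARY[institution]
    # Refuse fields that have no agreed concept instead of guessing.
    unknown = set(record) - set(mapping)
    if unknown:
        raise KeyError(f"no concept defined for fields: {sorted(unknown)}")
    return {mapping[field]: value for field, value in record.items()}


row_a = harmonize("univ_a", {"opiskelija_nro": "A-17", "op": 30})
row_b = harmonize("univ_b", {"stud_no": "B-04", "ects": 25})

# Both rows now use the same keys and can be compared directly.
print(row_a)
print(row_b)
```

Rejecting unmapped fields mirrors the maintenance process described above: a gap in the glossary becomes something to define, not something to paper over.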
Shared source of information
“A data warehouse based on a common concept model enables reliable comparison of information between the different higher education institutions. Data warehousing helps to create a unified understanding of information that serves as the basis for decision-making”, Kemppainen continues.
Comparable data is required by authorities, including the Ministry of Education, where funding decisions are made by comparing corresponding figures. From now on, the work on annual operating budgets and planning at higher education institutions will be easier, because the background data is the same for all. The system enables data sharing and the use of distributed data. As for data security, the warehousing will be arranged so that data is accessible only by authorized users.
This autumn the data warehouse will be tested in phases at three institutions, mainly for student, finance, personnel, and facility administration, as well as research and development activities. Once the procedures have been documented, the work will be extended to seven other institutions.
Through testing to serious action
In practice, the testing phase is the same as implementation. A data warehouse is currently being created for three higher education institutions (the University of Oulu, Aalto University, and HAMK University of Applied Sciences), and the warehouse is gradually being taken into use in everyday work.
“The data obtained from the source systems is loaded into the DWH database, a relational database, and the system queries whether the desired information is found within the source systems’ contents. These source systems include, for example, various financial administration programs, which are now being made to communicate with the DWH database”, says Project Manager Heikki Haavisto from CSC, explaining the practical work in the project.
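The load-then-query pattern Haavisto describes can be sketched with Python's built-in `sqlite3` module standing in for the relational DWH database. This is a minimal illustration under stated assumptions: the table `cost_fact`, its columns, and the sample rows are invented for the example and do not reflect the project's actual schema or data.

```python
import sqlite3

# Hypothetical rows as they might arrive from a financial administration system.
source_rows = [
    ("univ_a", "2009", "teaching", 1200.0),
    ("univ_a", "2009", "research", 800.0),
    ("univ_b", "2009", "teaching", 950.0),
]

# An in-memory database stands in for the DWH database.
dwh = sqlite3.connect(":memory:")
dwh.execute(
    "CREATE TABLE cost_fact ("
    "institution TEXT, year TEXT, activity TEXT, amount REAL)"
)
dwh.executemany("INSERT INTO cost_fact VALUES (?, ?, ?, ?)", source_rows)
dwh.commit()

# Once the data is in one place, the same query serves every institution.
totals = dict(
    dwh.execute(
        "SELECT institution, SUM(amount) FROM cost_fact GROUP BY institution"
    ).fetchall()
)
print(totals)
```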
The test work involves error checking and determining specifications for the deductive terms. The mechanisms of data transfer are also being addressed, including how to arrange the transfer steps by first splitting the data and then reassembling it. In addition, a few internal and external reporting routines are being implemented for the higher education institutions.
“Should the data query produce ambiguous or unreliable information, unifying information must be added to the source data. For example, variation may occur in individuals' names, like first name-last name or last name-first name, with or without a comma, so the personal identification number is queried as the linking factor”, Haavisto explains.
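The point about names can be illustrated with a small sketch that links records from two source systems by an identifier instead of by name. The records, the field names, and the `ID-001` style identifiers are hypothetical placeholders, not real personal identification numbers or actual source system fields.

```python
# Hypothetical records: the same person's name is written differently
# in the two source systems.
hr_records = [
    {"person_id": "ID-001", "name": "Virtanen, Maija", "unit": "Physics"},
]
study_records = [
    {"person_id": "ID-001", "name": "Maija Virtanen", "courses": 3},
    {"person_id": "ID-002", "name": "Matti Korhonen", "courses": 1},
]


def link_by_person_id(left, right):
    """Join two record lists on person_id; fields from `left` win on conflict."""
    by_id = {row["person_id"]: row for row in right}
    linked = []
    for row in left:
        match = by_id.get(row["person_id"])
        if match is not None:
            linked.append({**match, **row})
    return linked


# Matching on the name string finds nothing, because the formats differ.
study_names = {row["name"] for row in study_records}
name_matches = [row for row in hr_records if row["name"] in study_names]

# Matching on the identifier finds the person.
linked = link_by_person_id(hr_records, study_records)
print(len(name_matches), len(linked))
```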
Alongside this project, the concept model is continuously being refined, and in this work the individual higher education institutions’ own specialists play a key role. They know best what each term covers in the real world. The concepts become more focused when they are tested with real data.
“The first official version of the service will be launched after the end of the year. The implementation built in the test framework can then be taken into production use, and each higher education institution can add reports to it as part of its own further development. The testing of new subfields will continue throughout 2010, and by the end of the year ten modules will be completed”, explains Haavisto, reviewing the work ahead.
Anneli Frantti
Figure © Sanakunta
RAKETTI
The data warehousing project is one of the four subprojects of RAKETTI (RAkenteellisen KEhittämisen Tukena TIetohallinto), the joint project of Finnish higher education institutions and the Ministry of Education. In addition to the shared data warehouse and common concept model, the work involves defining a total architecture model, preparing a new basic information system for the academic administration, and developing IT solutions for research and research administration for higher education institutions. CSC coordinates the RAKETTI project.