Data storage services > Database service > Using the database service
Tehdyt toimenpiteet

Using the database service

Content



  • MySQL client programs in Hippu and Murska
  • Data import and export
  • Using MySQL client through batch job system
  • Using kaivos.csc.fi directly from your local computer
  • Following the disk usage in kaivos.csc.fi
  • MySQL application programming interfaces
  • Examples for using MySQL in Murska.csc.fi




MySQL client programs in Hippu and Murska


Below you can find instructions on how to use the MySQL client programs in CSC's computing environment. The SQL language is not covered, but you can find a lot of  documentation and guidance elsewhere (e.g., MySQL user manual).

The database service is used through MySQL client programs, which can be accessed by logging on the Hippu or Murska computing servers. Users open the connection from the computing server to the kaivos.csc.fi server, which provides the database service. Once the database user accounts have been received from CSC,  the user can start to create tables and store data in the empty database.

To use the MySQL client programs in Hippu, the set up command is:

     module load mysql

In Murska, first give the following set up command:

     module load testing/mysql

After this you can start the MySQL client program or execute some of the MySQL commands. A MySQL command line client session to kaivos.csc.fi is opened with the command:

mysql -u database_user_account –p -h kaivos.csc.fi --local  database_name


This starts an interactive MySQL session which you can use to execute SQL commands for your database.  In the command above options -u and -h define your database user account and the database server name (kaivos.csc.fi). The -p option defines that password is used for authentication. The --local selection is not always necessary, but it is useful as it indicates to the MySQL client that the MySQL connection comes from  a remote host.

Instead of giving the SQL commands interactively, you can also write the SQL commands into a file and execute them with the command:
         

mysql -u database_user_account -p -h kaivos.csc.fi --local  database_name < commands.sql > output.txt


or

mysql -u database_user_account –p -h kaivos.csc.fi --local  --skip-column-names --quick database_name < commands.sql > output.txt

In the latter command, the --skip-column-names option is used to print out only the data produced by the SQL commands. Otherwise names of the selected columns would be printed too. The --quick option makes the client to print each row as it is received instead of storing it to the cache first.


Setting default values for MySQL connection

In the mysql commands above, the client program would ask for the username and password every time when a mysql command is executed. It is however possible to define default values for the MySQL username and password, that will be used for the connection if no password or user name is given. Setting the default values is useful especially in cases where the user uses mostly just  one data base.

The default values for the MySQL connections is defined in a file called .my.cnf (note the dot in the beginning of the file name) that locates in the user's home directory. This definition file can be constructed in Hippu with a normal text editor or with command:

kaivos_mysql_cnf

Bellow is shown the basic structure of the .my.cnf file:

[client]
user = MySQL_username
password = MySQL_password
host = kaivos.csc.fi

[mysql]
database = database_name

Note that the computing nodes of Murska do not see the users home directory. If you wish to use the default values when you are running batch jobs, you must copy the .my.cnf file to the work directory ($WRKDIR) before launching the jobs.


You can also store the settings form MySQL connection to some other file name, and apply these settings by using option --defaults-extra-file=settings_file. For example, if we would like to use MySQL connection confiduration that is stored to file

db_conn2.def, we could execute the previosuly used MySQL query by using command:

mysql --defaults-extra-file=db_conn2.def --local  database_name < commands.sql > output.txt



Graphical Interfaces


Note that for interactive database usage you can also use graphical MySQL clients mysqlcc and mysql-query-browser. These tools, available at murska.csc.fi, are very efficient when you need to get familiar and administrate a complex database that contains a large number of tables. You must have an X-term connection between Murska and your local host to use these tools. To start the graphical interface, give the command:  

   mysqlcc        

or

   mysql-query-browser


The actual SQL command language is not discussed in this document in detail. Please use the MySQL manual or some of the many SQL-guides published for an introduction to SQL databases.


Data import and export


For loading large datasets into the database we recommend using command mysqlimport.  This command reads in a delimited text file into a table already existing in the database. To load a large text file from Murska to a database in Kaivos, you can use the command syntax:

mysqlimport -h kaivos.csc.fi -u database_user_account --local --compress --password database_name input_file.table

Mysqlimport strips any extension from the input file name and uses the result to determine the name of the table into which to import the file's contents. The user must make sure that the database contains a previously created table with data columns that match the data in the input data file. The user may also have to change the name of the input file if it is not compatible with a database table. The --local option defines that the data file locates in the machine where the client is used and not in the actual database server. Thus in the case of mysqlimport, this option is obligatory in Murska and Hippu.

Downloading complete tables or databases from the database can be done with command mysqldump. This command was developed for making backup copies of MySQL databases. In the case of kaivos.csc.fi backup copies are not needed as the database is automatically backup copied by CSC.  Instead mysqldump offers an easy way to make a copy of your database so that you can move both the data content and structure of your database to another SQL server. You can make a copy of the whole database:

mysqldump –u database_user_account -h kaivos.csc.fi –p database > database_dump.txt

or just from one or more tables:

mysqldump –u database_user_account -h kaivos.csc.fi –p  database table_name > table_dump.txt


When mysqldump is used with the default settings the result files contain MySQL commands that are needed to create and populate the selected database tables. Mysqldump-locks the table in the beginning of the copying. Because of this only the dbname_admin user account of the database can launch the command by default. In the case of other user accounts (dbname_read or  dbname_user)  --skip-lock-tables option should be added to the mysqldump command.


You can import the database_dump.txt file as follows: 

mysql –u database_user_account -h kaivos.csc.fi –p database < database_dump.txt


Using MySQL client through batch job system


The MySQL client program can be used in the batch job systems in the same way as in interactive client usage. The only difference is that in the batch jobs, the database password can't be given interactively. Insted is should be given by using MySQL configuration file and the --defaults-extra-file option.


Below is a sample MySQL script for Murska. First we need to create a MySQL connection cofiguration file that locates in the $WRKDIR directory of Muska.  In this case we use user account mydb1_admin, whose password is abc123.  The file, named as mysql.def, would now look like following:

[client]
user = mydb1_admin
password = abc123
host = kaivos.csc.fi


Then we create the actual batch job script. The script bellow reserves 12 h time and 1 GB memory to run the MySQL query that is defined in the file query_commands.sql. The query is made to database mydb1 and the doccection parameters are red from file mysql.def. The results are written to the results.txt file.

#!/bin/tcsh
#BSUB -L /bin/csh
#BSUB -J mysql
#BSUB -e mysql_job%J
#BSUB -o mysql_job%J
#BSUB -M 1048576
#BSUB -W 12:00

module load mysql
cd $WRKDIR/my_data

mysql --local --defaults-extra-file=$WRKDIR/mysql.def mydb1 <query_commands.sql > results.txt


Using kaivos.csc.fi directly from your local computer


The MySQL databases in kaivos.csc.fi can be directly accessed only from the computing servers of CSC. However, you can make the database visible to your own computer using your CSC user account and port forwarding through an ssh tunnel.


In linux and MacOSX machines an ssh tunnel from your local computer to kaivos.csc.fi via hippu.csc.fi can be done for example with the command:

ssh -l csc_user_account -L 3306:kaivos.csc.fi:3306 hippu.csc.fi -N

In Windows machines you can use e.g. plink program to open the tunnel. Plink can only be used through the command prompt. Bellow is a sample command for using plink:

plink -L 3306:kaivos.csc.fi:3306 hippu_username@hippu.csc.fi


The above commands define that the communication to port 3306 (the default port of MySQL) in you local computer is transported to the MySQL port of kaivos.csc.fi through hippu.csc.fi server. The –N option in the end of the connection command blocks the command shell after the connection is established. Once the connection is closed also the port forwarding becomes disabled. Note that the ssh command above uses your CSC user account and password, not the database user account.

As long as the connection to hippu.csc.fi is active the database in kaivos.csc.fi can be accessed from your local computer using the same MySQL commands as described above for Murska and Hippu (assuming that you have the MySQL client program installed in your local computer). The only difference compared to previous command examples is that the host section  (-h) should now point to host 127.0.0.1 that refers to your local host, instead of kaivos.csc.fi.

So for example the syntax to open an interactive MySQL session would now be:

mysql -u database_user_account –p -h 127.0.0.1 --local  database_name


And the syntax for mysqlimport:

mysqlimport -h 127.0.0.1 -u database_user_account --local --compress --password database_name input_file.table

In the same way, you can make your databases visible in your local computer using locally installed graphical MySQL interfaces. The only major limitation with the port forwarding is that normally ssh tunnels are not very stable. You may need to reopen the ssh connection every now and then and you should not trust that the tunnel remains usable for several hours.

For the security reasons we recommend that you always close the ssh connection when you stop using the database. 


Following the disk usage in kaivos.csc.fi


Each user has only limited disc space available in the kaivos.csc.fi server. If the database reaches the disk quota, the database users can no longer write to the database. In these cases the users should clean up the database to reduce the size or apply more disk space from CSC.

In murska.csc.fi, you can check the database quota and usage with the command:

    mysql_quota


or

  mysql_quota –u database_user_account –p database_password


The situation can also be checked during the MySQL terminal session. To see the disk quota in kaivos.csc.fi, give the MySQL command:

  CALL quotadb.quota();


The current size of the database can be checked with the MySQL command:
   

  CALL quotadb.usedquota();



MySQL application programming interfaces


More sophisticated MySQL usage is often performed with MySQL commands embedded to other programs or through language specific application programming interfaces (API). Support for this kind of usage in Murska is however rather limited and the MySQL API selection is not actively maintained. Please consult CSC about the availability of different APIs.



Examples for using MySQL in Murska.csc.fi

Example 1. Creating a table


In the following example we create a new  table called results into an empty database DB_A.

First we connect to the database with the MySQL client:

     mysql –u DB_A_admin -h kaivos.csc.fi –p DB_A


Next we create a new table that contains three columns: id, value and comment. Id column is in this case defined to be a unique integer, value column contains floating point numbers and comment column text data (a non binary string with max. 30 characters).  Note that in real life you normally define many other features like the primary key column and auto filling etc. when you create a new table.

mysql> CREATE TABLE results (id INT UNIQUE, value FLOAT, comment VARCHAR(30));


You can now use SHOW TABLES to see which tables your database contains.

mysql> SHOW TABLES;
+----------------+
| Tables_in_DB_A |
+----------------+
| results        |
+----------------+
1 row in set (0.01 sec)



Data can be inserted to the table with the command INSERT INTO. Below we insert three new lines to the table:

mysql> INSERT INTO results (id, value, comment) VALUES (1,  27.45 , "Test case" );
mysql> INSERT INTO results (id, value, comment) VALUES (2,  12.33 , "Another" );
mysql> INSERT INTO results (id, value, comment) VALUES (3,  25.33 , "Value2" );


When the table contains data, we can now do SQL queries with the SELECT command:

mysql> SELECT value FROM results WHERE id=2;
+-------+
| value |
+-------+
| 12.33 |
+-------+


The MySQL client session is closed with the command EXIT:

mysql>EXIT



Example 2. Importing data with mysqlimport


In this example we import a dataset into the result table created in example1.  The data to be imported to the database locates in file: data_to_import.txt. This file contains data rows like:

1       -419.557        STRUCTURE1.PDB
2       -479.662        STRUCTURE2.PDB
3       -517.019        STRUCTURE3.PDB
4       -450.922        STRUCTURE4.PDB
5       -421.991        STRUCTURE5.PDB
6       -507.076        STRUCTURE6.PDB
7       -444.598        STRUCTURE7.PDB
8       -444.552        STRUCTURE8.PDB
9       -414.492        STRUCTURE9.PDB
10      -444.549        STRUCTURE10.PDB
11      -463.394        STRUCTURE11.PDB
12      -430.548        STRUCTURE12.PDB
etc…

To import the data to the results-table, we must first copy the data to a file with a name that is compatible with the table name.   

cp data_to_import.txt results.table


Then use mysqlimport to import the data

   mysqlimport -h kaivos.csc.fi --local --compress –p –u DB_A_admin DB_A results.table


 Example 3. Using MySQL database from a batch job script in Murska


The MySQL database in kaivos.csc.fi is intended for cases where computing servers of CSC use the MySQL database to store and analyze data. In these cases the database is normally not used interactively, but the MySQL client is used automatically from a shell or program script.  The database is accessed using the database user account DB_A_admin  and password abc123.  The information is first stored into a file named as mysql.def.  This file locates in the $WRKDIR-directory and conatins following rows:

[client]
user = DB_A_admin
password = abc123
host = kaivos.csc.fi

Below is a sample batch job script that utilizes kaivos.csc.fi within the batch queue system.

#!/bin/csh
#BSUB -L /bin/csh
#BSUB -J jobname
#BSUB -e output
#BSUB -o error
#BSUB -M 2097152
#BSUB -W 6:00

#load mysql environment
module load testing/mysql

#go to the right directory
cd datadir

# run the analysis
my_program < inputfile30.data > results.30

#feed the data to the database
mysqlimport --defaults-extra-file=$WRKDIR/mysql.def --local --compress DB_A results.30

#change the status value in the dataset_table
mysql --defaults-extra-file=$WRKDIR/mysql.def --local DB_A <<EOF
update dataset_table set status=”done" where name="inputfile30.data" ;
EOF

#remove the original results file
rm –f results.30


The sample script above first analyzes a file called inputfile30.data with program my_program. The results are first written to file called results.30. The data in this file is then imported to a database with mysqlimport command. Note that the script assumes that a table called results already exists in the database DB_A and that the columns in the results file are in the same order as in the database table.

After importing the data to the database the script launches another MySQL command. The second command modifies an existing table called dataset_table. The mysql command changes the status value in this table so that in the row where the name column contains the value inputfile30.data that status column gets the value  done.


Example 4. Using Kaivos through the perl DBI module


#!/v/linux26_x86_64/appl/molbio/bioperl/perl/bin/perl
use strict;
#use the DBI module
use DBI;

#pick the parameters from the command line
my $arg=\@ARGV;
my $threshold = "$arg->[0]

#Define parameters for the database connection
my $DB_URL = "dbi:mysql:DB_A;kaivos.csc.fi";
my $DB_USER = 'DB_A_user";
my $DB_PASSWD = "abc123";

#Open the connection to the database
my $dbh = DBI->connect(
                $DB_URL, $DB_USER,
                $DB_PASSWD,
                {
                        PrintError => 0,
                       AutoCommit => 1,
                } 
                   );

#Define and prepare the SQL statement
my $sql = "SELECT comment, value FROM results  WHERE value<?";
my $sth = $dbh->prepare($sql);

# Execute the quiery. The question mark in the query statement will be
# replaced by the value of the threshold variable
my $numrows = $sth->execute("$threshold");
print "The number of result rows found: $numrows. \n" ;

# $sth->fetchrow_array retrieves one result rows at a time
# read all result rows by using while loio
while ( my @row = $sth->fetchrow_array ) {
for (@row) {
print "$_\t";
}
print "\n";
}

# close the query
$sth->finish;

# close the database connection
$dbh->disconnect;