1. Installation

1.1. System requirements

After a complete installation of the HGNC (gene symbols and names) and HCOP (orthology) data by PyHGNC, ~1,441,250 rows in 22 tables need only ~380 MB of disk storage (depending on the RDBMS used).
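Before installing, it can be useful to check that the target filesystem has roughly that much free space. The following is a minimal sketch using only the standard library; the helper name has_enough_space and the reading of ~380 MB as 380 * 10^6 bytes are assumptions made for illustration, not part of PyHGNC. For MySQL/MariaDB the relevant path is the server's data directory, for SQLite the directory holding the database file.

```python
import shutil

# Approximate footprint quoted above, read as decimal megabytes (assumption).
REQUIRED_BYTES = 380 * 1000 * 1000


def has_enough_space(path=".", required_bytes=REQUIRED_BYTES):
    """Return True if the filesystem holding `path` has enough free space."""
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_bytes


if __name__ == "__main__":
    print("Enough free space for PyHGNC data:", has_enough_space())
```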

Tests were performed on Ubuntu 16.04 with 4 x Intel Core i7-6560U CPU @ 2.20 GHz and 16 GiB of RAM. In general, PyHGNC should also work on other systems such as Windows, other Linux distributions or macOS. Installation was complete after ~4 min. For systems with low memory, the option --low_memory was added to the update method.

1.2. Supported databases

PyHGNC uses SQLAlchemy to cover a wide spectrum of RDBMSs (relational database management systems). For best performance, MySQL or MariaDB is recommended. If you cannot install software on your system, SQLite, which needs no further installation, also works. The following RDBMSs are supported (by SQLAlchemy):

  1. Firebird
  2. Microsoft SQL Server
  3. MySQL / MariaDB
  4. Oracle
  5. PostgreSQL
  6. SQLite
  7. Sybase

1.3. Install software

The following instructions are written for Linux/macOS. The way you install Python software on Windows may differ.

It often makes sense to avoid conflicts with other Python installations by using separate virtual environments. Read here about easy setup and management of different virtual environments.
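To confirm that the interpreter you are about to install into really belongs to a virtual environment, a quick check like the one below can help. This is a small standard-library sketch; the function name in_virtual_environment is hypothetical and not part of pyhgnc.

```python
import sys


def in_virtual_environment():
    """Return True when the interpreter runs inside a venv/virtualenv."""
    # Inside a virtual environment sys.prefix points at the environment,
    # while sys.base_prefix still points at the base Python installation.
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    return sys.prefix != base_prefix


if __name__ == "__main__":
    print("Running inside a virtual environment:", in_virtual_environment())
```

If this prints False, a plain pip install would target the base installation, which is the case the sudo and --user variants in this section are meant for.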

  • If you want to install pyhgnc system-wide, use superuser rights (sudo on Ubuntu):
sudo pip install pyhgnc
  • If you have no sudo rights, install as user:
pip install --user pyhgnc
  • If you want to make sure you install pyhgnc into a Python 3 environment:
sudo python3 -m pip install pyhgnc

1.3.1. MySQL/MariaDB setup

In general you don't have to set up any database, because pyhgnc uses file-based SQLite by default. However, we strongly recommend using MySQL/MariaDB.

Log in to MySQL/MariaDB as the root user, create a new database, create a user, assign the rights and flush privileges.

CREATE DATABASE pyhgnc CHARACTER SET utf8 COLLATE utf8_general_ci;
GRANT ALL PRIVILEGES ON pyhgnc.* TO 'pyhgnc_user'@'%' IDENTIFIED BY 'pyhgnc_passwd';
FLUSH PRIVILEGES;

The simplest way to configure pyhgnc for MySQL/MariaDB is to use the command ...

pyhgnc mysql

... and accept all default values.

Another way is to open a Python shell and set the MySQL configuration. If you have not changed anything in the SQL statements above, the default arguments should match them:

import pyhgnc
pyhgnc.set_mysql_connection()

If you have used your own settings, please adapt the following command to your requirements.

import pyhgnc
pyhgnc.set_mysql_connection(host='localhost', user='pyhgnc_user', passwd='pyhgnc_passwd', db='pyhgnc')

1.3.2. Updating

During the update process PyHGNC downloads HGNC and HCOP files from the EBI FTP server.

The downloaded files do not take up space on your disk once the update process has finished.

To update from command line or terminal:

pyhgnc update

Update options are available as well; type pyhgnc update --help to get a full list with descriptions.
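If the update should be triggered from a script (for example a scheduled job), the command line call can be wrapped with the standard library. The sketch below is only an illustration: the helper names build_update_command and run_update are hypothetical, and --low_memory is the option mentioned in the system requirements; check pyhgnc update --help for the options your installed version actually supports.

```python
import shutil
import subprocess


def build_update_command(low_memory=False):
    """Build the argument list for the `pyhgnc update` command."""
    command = ["pyhgnc", "update"]
    if low_memory:
        # Option named in the system requirements section (assumed spelling).
        command.append("--low_memory")
    return command


def run_update(low_memory=False):
    """Run the update and return its exit code.

    Raises FileNotFoundError if the pyhgnc executable is not on PATH.
    """
    command = build_update_command(low_memory)
    if shutil.which(command[0]) is None:
        raise FileNotFoundError("pyhgnc executable not found on PATH")
    return subprocess.run(command, check=False).returncode
```

Calling run_update(low_memory=True) would then execute pyhgnc update --low_memory and hand back the process exit code, so the calling script can decide how to react to a failed update.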

To update from the Python shell:

import pyhgnc
pyhgnc.update()

1.4. Changing database configuration

The following functions let you change the connection to your RDBMS (relational database management system). The connection settings will be used by default the next time pyhgnc is executed.

To set a new MySQL/MariaDB connection use the interactive command line interface (bash, terminal, cmd) ...

pyhgnc mysql

... or in Python shell ...

import pyhgnc
pyhgnc.set_mysql_connection(host='localhost', user='pyhgnc_user', passwd='pyhgnc_passwd', db='pyhgnc')

To set a connection to other database systems, use database.set_connection().

For more information about connection strings go to the SQLAlchemy documentation.

Examples for valid connection strings are:

  • mysql+pymysql://user:passwd@localhost/database?charset=utf8
  • postgresql://scott:tiger@localhost/mydatabase
  • mssql+pyodbc://user:passwd@database
  • oracle://user:passwd@
  • Linux: sqlite:////absolute/path/to/database.db
  • Windows: sqlite:///C:\path\to\database.db
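Connection strings of this shape can be assembled programmatically, which helps when a password contains characters such as '@' or '/' that must be URL-encoded. The following is a minimal sketch using only the standard library; build_url and sqlite_url are hypothetical helpers written for this example and are not part of pyhgnc or SQLAlchemy.

```python
from pathlib import PurePosixPath, PureWindowsPath
from urllib.parse import quote_plus


def build_url(driver, user, passwd, host, database, **query):
    """Assemble driver://user:passwd@host/database[?key=value&...]."""
    # Credentials are URL-encoded so that characters like '@' or '/'
    # cannot be mistaken for URL delimiters.
    url = "{}://{}:{}@{}/{}".format(
        driver, quote_plus(user), quote_plus(passwd), host, database
    )
    if query:
        pairs = ["{}={}".format(k, quote_plus(str(v))) for k, v in sorted(query.items())]
        url += "?" + "&".join(pairs)
    return url


def sqlite_url(path):
    """Build a SQLite URL; an absolute POSIX path gives four slashes in total."""
    return "sqlite:///" + str(path)


if __name__ == "__main__":
    print(build_url("mysql+pymysql", "pyhgnc_user", "pyhgnc_passwd",
                    "localhost", "pyhgnc", charset="utf8"))
    print(sqlite_url(PurePosixPath("/absolute/path/to/database.db")))
    print(sqlite_url(PureWindowsPath(r"C:\path\to\database.db")))
```

The resulting string is what would be handed to the connection function described in this section.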

You can use the following code to connect pyhgnc to an Oracle database:

import pyhgnc