3. Query functions¶
Contents
3.1. Before you query¶
3.1.1. 1. You can use % as a wildcard.¶
import pyhgnc
query = pyhgnc.query()
# exact search
query.hgnc(name='amyloid beta precursor protein')
# starts with 'amyloid beta'
query.hgnc(name='amyloid beta %')
# ends with 'precursor protein'
query.hgnc(name='% precursor protein')
# contains 'precursor'
query.hgnc(name='%precursor%')
3.1.2. 2. limit to restrict number of results¶
import pyhgnc
query = pyhgnc.query()
query.hgnc(limit=10)
Use an offset by paring a tuple (page_number, number_of_results_per_page) to the parameter limit.
page_number starts with 0!
import pyhgnc
query = pyhgnc.query()
# first page with 3 results (every page have 3 results)
query.hgnc(limit=(0,3))
# fourth page with 10 results (every page have 10 results)
query.hgnc(limit=(4,10))
3.1.3. 3. Return pandas.DataFrame
as result¶
This is very useful if you want to profit from amazing pandas functions.
import pyhgnc
query = pyhgnc.query()
query.hgnc(as_df=True)
3.1.4. 4. show all columns as dict¶
import pyhgnc
query = pyhgnc.query()
first_entry = query.hgnc(limit=1)[0]
first_entry.to_dict()
3.1.5. 5. Return single values with key name¶
import pyhgnc
query = pyhgnc.query()
query.hgnc(name='%kinase')[0].name
3.1.6. 6. Access to the linked data models (1-n, n-m)¶
From results of pyhgnc.query().hgnc() you can access
- alias_symbols
- alias_names
- rgds
- omims
- ccdss
- lsdbs
- orthology_predictions
- enzymes
- gene_families
- refseq_accessions
- mgds
- uniprots
- pubmeds
- enas
import pyhgnc
query = pyhgnc.query()
r = query.hgnc(limit=1)[0]
r.alias_symbols
r.alias_names
r.rgds
r.omims
r.ccdss
r.lsdbs
r.orthology_predictions
r.enzymes
r.gene_families
r.refseq_accessions
r.mgds
r.uniprots
r.pubmeds
r.enas
But for example from pyhgnc.query().uniprot() you can go back to hgnc
import pyhgnc
query = pyhgnc.query()
uniprot = query.uniprot(uniprotid='Q9BTE6')[0]
uniprot.hgncs
# [AARSD1, PTGES3L-AARSD1]
# following is crazy but possible, again go back to ec_number
uniprot.hgncs[0].uniprots
# [Q9BTE6]
3.1.7. 7. HGNC identifier and symbol is available in all methods¶
Hint
In all query functions (except hgnc) you have the parameters - hgnc_identifier - hgnc_symbol even it is not part of the model.
import pyhgnc
query = pyhgnc.query()
query.alias_symbol(hgnc_identifier=620)
# [AD1]
query.alias_symbol(hgnc_symbol='APP')
# [AD1]
3.2. hgnc¶
import pyhgnc
query = pyhgnc.query()
query.hgnc(entrez=503538)
Check documentation of pyhgnc.manager.query.QueryManager.hgnc()
for all available parameters.
3.3. orthology_prediction¶
import pyhgnc
query = pyhgnc.query()
query.orthology_prediction(ortholog_species=10090, hgnc_symbol='APP')
# [10090: amyloid beta (A4) precursor protein: App]
Check documentation of pyhgnc.manager.query.QueryManager.orthology_prediction()
for all available parameters.
3.4. alias_symbol¶
import pyhgnc
query = pyhgnc.query()
result = query.alias_symbol(alias_symbol='AD1')[0]
result.hgnc
# APP
Check documentation of pyhgnc.manager.query.QueryManager.alias_symbol()
for all available parameters.
3.5. alias_name¶
import pyhgnc
query = pyhgnc.query()
result = query.alias_name(alias_name='peptidase nexin-II')[0]
result.hgnc.name
# 'amyloid beta precursor protein'
Check documentation of pyhgnc.manager.query.QueryManager.alias_name()
for all available parameters.
3.6. gene_family¶
import pyhgnc
query = pyhgnc.query()
result = query.gene_family(family_name='Parkinson%')[0]
result
# 'Parkinson disease associated genes'
result.hgncs
# [ATP13A2, EIF4G1, FBXO7, HTRA2, LRRK2, PARK3, PARK7, PARK10, PARK11, PARK12, PARK16, PINK1,\
# PLA2G6, PRKN, SNCA, UCHL1, VPS35]
Check documentation of pyhgnc.manager.query.QueryManager.gene_family()
for
all available parameters.
3.7. ref_seq¶
import pyhgnc
query = pyhgnc.query()
query.ref_seq(hgnc_symbol='APP')
# [NM_000484]
Check documentation of pyhgnc.manager.query.QueryManager.ref_seq()
for all
available parameters.
3.8. rgd¶
import pyhgnc
query = pyhgnc.query()
query.rgd(rgdid=2139)[0].hgncs
# [APP]
Check documentation of pyhgnc.manager.query.QueryManager.rgd()
for all available parameters.
3.9. omim¶
import pyhgnc
query = pyhgnc.query()
query.omim(omimid=104760)[0].hgnc.name
# 'amyloid beta precursor protein'
Check documentation of pyhgnc.manager.query.QueryManager.omim()
for all available parameters.
3.10. mgd¶
import pyhgnc
query = pyhgnc.query()
query.mgd(mgdid=88059)[0].hgncs
# [APP]
Check documentation of pyhgnc.manager.query.QueryManager.mgd()
for all available parameters.
3.11. uniprot¶
import pyhgnc
query = pyhgnc.query()
query.uniprot(uniprotid='P05067')[0].hgncs
# [APP]
Check documentation of pyhgnc.manager.query.QueryManager.uniprot()
for all available parameters.
3.12. ccds¶
import pyhgnc
query = pyhgnc.query()
query.ccds(ccdsid='CCDS13576')[0].hgnc
# APP
Check documentation of pyhgnc.manager.query.QueryManager.ccds()
for all available parameters.
3.13. pubmed¶
import pyhgnc
query = pyhgnc.query()
query.pubmed(hgnc_symbol='A1CF')
# [11815617, 11072063]
Check documentation of pyhgnc.manager.query.QueryManager.pubmed()
for all available parameters.
3.14. ena¶
import pyhgnc
query = pyhgnc.query()
query.ena(hgnc_identifier=620)
# [AD1]
Check documentation of pyhgnc.manager.query.QueryManager.ena()
for all available parameters.
3.15. enzyme¶
import pyhgnc
query = pyhgnc.query()
query.enzyme(hgnc_symbol='PRKCA')
# [2.7.11.1]
Check documentation of pyhgnc.manager.query.QueryManager.enzyme()
for all available parameters.
3.16. lsdb¶
import pyhgnc
query = pyhgnc.query()
query.lsdb(hgnc_symbol='APP')
# [Alzheimer Disease & Frontotemporal Dementia Mutation Database]
Check documentation of pyhgnc.manager.query.QueryManager.lsdb()
for all available parameters.