Query
Query interface
PyHGNC provides a powerful query interface for the stored data. It can be accessed from a Python shell:
import pyhgnc
query = pyhgnc.query()
You can use the query interface instance to issue a query to any model defined in pyhgnc.manager.models:
# Issue query on hgnc table:
query.hgnc()
# Issue query on pubmed table:
query.pubmed()
Hint
See Query functions for more examples and check out pyhgnc.manager.query.QueryManager (below) for
all possible parameters for the different models.
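As a minimal sketch of how keyword filters might be passed, the helper below forwards a symbol to any object that exposes an hgnc method accepting symbol and limit, as documented in the reference below. FakeQuery, lookup_symbol and the sample rows are hypothetical and exist only so the snippet runs without a populated database; with PyHGNC installed, the object returned by pyhgnc.query() would take the place of FakeQuery.

```python
class FakeQuery:
    """Hypothetical stand-in for the object returned by pyhgnc.query()."""

    def __init__(self, rows):
        self._rows = rows

    def hgnc(self, symbol=None, limit=None, as_df=False):
        # Only the parameters used in this sketch are modelled here.
        rows = [r for r in self._rows if symbol is None or r["symbol"] == symbol]
        return rows[:limit] if isinstance(limit, int) else rows


def lookup_symbol(query, symbol):
    """Return the first HGNC entry for `symbol`, or None if nothing matches."""
    results = query.hgnc(symbol=symbol, limit=1)
    return results[0] if results else None


# Placeholder rows, not real HGNC data.
query = FakeQuery([
    {"symbol": "GENE_A", "identifier": 1},
    {"symbol": "GENE_B", "identifier": 2},
])
entry = lookup_symbol(query, "GENE_B")
```

Because lookup_symbol only relies on the hgnc method, the same helper should work unchanged once a real query object is substituted.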
Query Manager Reference
class pyhgnc.manager.query.QueryManager(connection=None, echo=False)
Query interface to database.
-
alias_name(alias_name=None, is_previous_name=None, hgnc_symbol=None, hgnc_identifier=None, limit=None, as_df=False)
Method to query models.AliasName objects in database.
Parameters:
- alias_name (str or tuple(str) or None) – alias name(s)
- is_previous_name (bool or tuple(bool) or None) – flag for ‘is previous’
- hgnc_symbol (str or tuple(str) or None) – HGNC symbol(s)
- hgnc_identifier (int or tuple(int) or None) – identifier(s) in models.HGNC
- limit (int or tuple(int) or None) –
  - if isinstance(limit,int)==True -> limit
  - if isinstance(limit,tuple)==True -> format:= tuple(page_number, results_per_page)
  - if limit == None -> all results
- as_df (bool) – if True results are returned as pandas.DataFrame
Returns:
- if as_df == False -> list(models.AliasName)
- if as_df == True -> pandas.DataFrame
Return type: list(models.AliasName) or pandas.DataFrame
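The three forms of limit recur in every query method of this reference. The sketch below illustrates them on a plain Python list; it is not PyHGNC's implementation. apply_limit is a hypothetical helper, and it assumes that page_number counts from 0, which the reference does not state.

```python
def apply_limit(rows, limit=None):
    """Illustrate the three documented forms of `limit` on a plain list.

    Assumption: page_number counts from 0; the reference does not say so.
    """
    if limit is None:
        # No limit: all results.
        return list(rows)
    if isinstance(limit, int):
        # Plain integer: at most `limit` results.
        return list(rows[:limit])
    # Tuple: (page_number, results_per_page).
    page_number, results_per_page = limit
    start = page_number * results_per_page
    return list(rows[start:start + results_per_page])


rows = list(range(10))
first_three = apply_limit(rows, 3)
second_page = apply_limit(rows, (1, 4))
everything = apply_limit(rows)
```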
-
alias_symbol(alias_symbol=None, is_previous_symbol=None, hgnc_symbol=None, hgnc_identifier=None, limit=None, as_df=False)
Method to query models.AliasSymbol objects in database.
Parameters:
- alias_symbol (str or tuple(str) or None) – alias symbol(s)
- is_previous_symbol (bool or tuple(bool) or None) – flag for ‘is previous’
- hgnc_symbol (str or tuple(str) or None) – HGNC symbol(s)
- hgnc_identifier (int or tuple(int) or None) – identifier(s) in models.HGNC
- limit (int or tuple(int) or None) –
  - if isinstance(limit,int)==True -> limit
  - if isinstance(limit,tuple)==True -> format:= tuple(page_number, results_per_page)
  - if limit == None -> all results
- as_df (bool) – if True results are returned as pandas.DataFrame
Returns:
- if as_df == False -> list(models.AliasSymbol)
- if as_df == True -> pandas.DataFrame
Return type: list(models.AliasSymbol) or pandas.DataFrame
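The recurring type "str or tuple(str) or None" suggests that a filter can be omitted, given as a single value, or given as several values. The sketch below shows one plausible reading on plain dictionaries. It is an assumption, not PyHGNC's implementation (the library may, for example, match strings by pattern rather than by equality); matches and the sample rows are hypothetical.

```python
def matches(value, criterion):
    """Return True if `value` satisfies a None / single-value / tuple criterion."""
    if criterion is None:
        # No filter requested for this field.
        return True
    if isinstance(criterion, tuple):
        # Several values: accept any of them.
        return value in criterion
    # Single value: require equality.
    return value == criterion


# Placeholder rows, not real HGNC data.
rows = [
    {"alias_symbol": "A1", "is_previous_symbol": False},
    {"alias_symbol": "B2", "is_previous_symbol": True},
    {"alias_symbol": "C3", "is_previous_symbol": True},
]

selected = [
    r["alias_symbol"]
    for r in rows
    if matches(r["alias_symbol"], ("A1", "C3"))
    and matches(r["is_previous_symbol"], None)
]
previous_only = [r["alias_symbol"] for r in rows if matches(r["is_previous_symbol"], True)]
```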
-
ccds(ccdsid=None, hgnc_symbol=None, hgnc_identifier=None, limit=None, as_df=False)
Method to query models.CCDS objects in database.
Parameters:
- ccdsid (str or tuple(str) or None) – Consensus CDS ID(s)
- hgnc_symbol (str or tuple(str) or None) – HGNC symbol(s)
- hgnc_identifier (int or tuple(int) or None) – identifier(s) in models.HGNC
- limit (int or tuple(int) or None) –
  - if isinstance(limit,int)==True -> limit
  - if isinstance(limit,tuple)==True -> format:= tuple(page_number, results_per_page)
  - if limit == None -> all results
- as_df (bool) – if True results are returned as pandas.DataFrame
Returns:
- if as_df == False -> list(models.CCDS)
- if as_df == True -> pandas.DataFrame
Return type: list(models.CCDS) or pandas.DataFrame
-
ena(enaid=None, hgnc_symbol=None, hgnc_identifier=None, limit=None, as_df=False)
Method to query models.ENA objects in database.
Parameters:
- enaid (str or tuple(str) or None) – European Nucleotide Archive (ENA) identifier(s)
- hgnc_symbol (str or tuple(str) or None) – HGNC symbol(s)
- hgnc_identifier (int or tuple(int) or None) – identifier(s) in models.HGNC
- limit (int or tuple(int) or None) –
  - if isinstance(limit,int)==True -> limit
  - if isinstance(limit,tuple)==True -> format:= tuple(page_number, results_per_page)
  - if limit == None -> all results
- as_df (bool) – if True results are returned as pandas.DataFrame
Returns:
- if as_df == False -> list(models.ENA)
- if as_df == True -> pandas.DataFrame
Return type: list(models.ENA) or pandas.DataFrame
-
enzyme(ec_number=None, hgnc_symbol=None, hgnc_identifier=None, limit=None, as_df=False)
Method to query models.Enzyme objects in database.
Parameters:
- ec_number (str or tuple(str) or None) – Enzyme Commission (EC) number(s)
- hgnc_symbol (str or tuple(str) or None) – HGNC symbol(s)
- hgnc_identifier (int or tuple(int) or None) – identifier(s) in models.HGNC
- limit (int or tuple(int) or None) –
  - if isinstance(limit,int)==True -> limit
  - if isinstance(limit,tuple)==True -> format:= tuple(page_number, results_per_page)
  - if limit == None -> all results
- as_df (bool) – if True results are returned as pandas.DataFrame
Returns:
- if as_df == False -> list(models.Enzyme)
- if as_df == True -> pandas.DataFrame
Return type: list(models.Enzyme) or pandas.DataFrame
-
gene_family(family_identifier=None, family_name=None, hgnc_symbol=None, hgnc_identifier=None, limit=None, as_df=False)
Method to query models.GeneFamily objects in database.
Parameters:
- family_identifier (int or tuple(int) or None) – gene family identifier(s)
- family_name (str or tuple(str) or None) – gene family name(s)
- hgnc_symbol (str or tuple(str) or None) – HGNC symbol(s)
- hgnc_identifier (int or tuple(int) or None) – identifier(s) in models.HGNC
- limit (int or tuple(int) or None) –
  - if isinstance(limit,int)==True -> limit
  - if isinstance(limit,tuple)==True -> format:= tuple(page_number, results_per_page)
  - if limit == None -> all results
- as_df (bool) – if True results are returned as pandas.DataFrame
Returns:
- if as_df == False -> list(models.GeneFamily)
- if as_df == True -> pandas.DataFrame
Return type: list(models.GeneFamily) or pandas.DataFrame
-
get_model_queries(query_obj, model_queries_config)
Use this if you are searching for a field in the same model.
-
hgnc(name=None, symbol=None, identifier=None, status=None, uuid=None, locus_group=None, orphanet=None, locus_type=None, date_name_changed=None, date_modified=None, date_symbol_changed=None, pubmedid=None, date_approved_reserved=None, ensembl_gene=None, horde=None, vega=None, lncrnadb=None, uniprotid=None, entrez=None, mirbase=None, iuphar=None, ucsc=None, snornabase=None, gene_family_name=None, mgdid=None, pseudogeneorg=None, bioparadigmsslc=None, locationsortable=None, ec_number=None, refseq_accession=None, merops=None, location=None, cosmic=None, imgt=None, enaid=None, alias_symbol=None, alias_name=None, rgdid=None, omimid=None, ccdsid=None, lsdbs=None, ortholog_species=None, gene_family_identifier=None, limit=None, as_df=False)
Method to query pyhgnc.manager.models.HGNC
Parameters:
- name (str or tuple(str) or None) – HGNC approved name for the gene
- symbol (str or tuple(str) or None) – HGNC approved gene symbol
- identifier (int or tuple(int) or None) – HGNC ID. A unique ID created by the HGNC for every approved symbol
- status (str or tuple(str) or None) – Status of the symbol report, which can be either “Approved” or “Entry Withdrawn”
- uuid (str or tuple(str) or None) – universally unique identifier
- locus_group (str or tuple(str) or None) – group name for a set of related locus types as defined by the HGNC
- orphanet (int or tuple(int) or None) – Orphanet database identifier (related to rare diseases and orphan drugs)
- locus_type (str or tuple(str) or None) – locus type as defined by the HGNC (e.g. RNA, transfer)
- date_name_changed (str or tuple(str) or None) – date the gene name was last changed (format: YYYY-mm-dd, e.g. 2017-09-29)
- date_modified (str or tuple(str) or None) – date the entry was last modified (format: YYYY-mm-dd, e.g. 2017-09-29)
- date_symbol_changed (str or tuple(str) or None) – date the gene symbol was last changed (format: YYYY-mm-dd, e.g. 2017-09-29)
- date_approved_reserved (str or tuple(str) or None) – date the entry was first approved (format: YYYY-mm-dd, e.g. 2017-09-29)
- pubmedid (int or tuple(int) or None) – PubMed identifier
- ensembl_gene (str or tuple(str) or None) – Ensembl gene ID. Found within the “GENE RESOURCES” section of the gene symbol report
- horde (str or tuple(str) or None) – symbol used within HORDE for the gene (not available in JSON)
- vega (str or tuple(str) or None) – Vega gene ID. Found within the “GENE RESOURCES” section of the gene symbol report
- lncrnadb (str or tuple(str) or None) – Noncoding RNA Database identifier
- uniprotid (str or tuple(str) or None) – UniProt identifier
- entrez (str or tuple(str) or None) – Entrez gene ID. Found within the “GENE RESOURCES” section of the gene symbol report
- mirbase (str or tuple(str) or None) – miRBase ID
- iuphar (str or tuple(str) or None) – The objectId used to link to the IUPHAR/BPS Guide to PHARMACOLOGY database
- ucsc (str or tuple(str) or None) – UCSC gene ID. Found within the “GENE RESOURCES” section of the gene symbol report
- snornabase (str or tuple(str) or None) – snoRNABase ID
- gene_family_name (str or tuple(str) or None) – Gene family name
- gene_family_identifier (int or tuple(int) or None) – Gene family identifier
- mgdid (int or tuple(int) or None) – Mouse Genome Database identifier
- imgt (str or tuple(str) or None) – Symbol used within international ImMunoGeneTics information system
- enaid (str or tuple(str) or None) – European Nucleotide Archive (ENA) identifier
- alias_symbol (str or tuple(str) or None) – Other symbols used to refer to a gene
- alias_name (str or tuple(str) or None) – Other names used to refer to a gene
- pseudogeneorg (str or tuple(str) or None) – Pseudogene.org ID
- bioparadigmsslc (str or tuple(str) or None) – Symbol used to link to the SLC tables database at bioparadigms.org for the gene
- locationsortable (str or tuple(str) or None) – locations sortable
- ec_number (str or tuple(str) or None) – Enzyme Commission number (EC number)
- refseq_accession (str or tuple(str) or None) – RefSeq nucleotide accession(s)
- merops (str or tuple(str) or None) – ID used to link to the MEROPS peptidase database
- location (str or tuple(str) or None) – Cytogenetic location of the gene (e.g. 2q34).
- cosmic (str or tuple(str) or None) – Symbol used within the Catalogue of somatic mutations in cancer for the gene
- rgdid (int or tuple(int) or None) – Rat genome database gene ID
- omimid (int or tuple(int) or None) – Online Mendelian Inheritance in Man (OMIM) ID
- ccdsid (str or tuple(str) or None) – Consensus CDS ID
- lsdbs (str or tuple(str) or None) – Locus Specific Mutation Database Name
- ortholog_species (int or tuple(int) or None) – Ortholog species NCBI taxonomy identifier
- limit (int or tuple(int) or None) –
  - if isinstance(limit,int)==True -> limit
  - if isinstance(limit,tuple)==True -> format:= tuple(page_number, results_per_page)
  - if limit == None -> all results
- as_df (bool) – if True results are returned as pandas.DataFrame
Returns:
- if as_df == False -> list(models.HGNC)
- if as_df == True -> pandas.DataFrame
Return type: list(models.HGNC) or pandas.DataFrame
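The date_* parameters of hgnc are documented as strings in the form YYYY-mm-dd. As a small sketch, such a string can be produced from a Python date object and collected with other filters into keyword arguments. hgnc_date and the filters dictionary are hypothetical names, and whether the library matches dates exactly is not stated in this reference.

```python
from datetime import date


def hgnc_date(d):
    """Format a date as YYYY-mm-dd, the form documented for the date_* parameters."""
    return d.strftime("%Y-%m-%d")


# Keyword arguments built from parameters documented for hgnc().
filters = {
    "status": "Approved",
    "date_modified": hgnc_date(date(2017, 9, 29)),
}

# With a real query object these would be passed as: query.hgnc(**filters)
```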
-
lsdb(lsdb=None, url=None, hgnc_symbol=None, hgnc_identifier=None, limit=None, as_df=False)
Method to query models.LSDB objects in database.
Parameters:
- lsdb (str or tuple(str) or None) – name(s) of the Locus Specific Mutation Database
- url (str or tuple(str) or None) – URL of the Locus Specific Mutation Database
- hgnc_symbol (str or tuple(str) or None) – HGNC symbol(s)
- hgnc_identifier (int or tuple(int) or None) – identifier(s) in models.HGNC
- limit (int or tuple(int) or None) –
  - if isinstance(limit,int)==True -> limit
  - if isinstance(limit,tuple)==True -> format:= tuple(page_number, results_per_page)
  - if limit == None -> all results
- as_df (bool) – if True results are returned as pandas.DataFrame
Returns:
- if as_df == False -> list(models.LSDB)
- if as_df == True -> pandas.DataFrame
Return type: list(models.LSDB) or pandas.DataFrame
-
mgd(mgdid=None, hgnc_symbol=None, hgnc_identifier=None, limit=None, as_df=False)
Method to query models.MGD objects in database.
Parameters:
- mgdid (str or tuple(str) or None) – Mouse genome informatics database ID(s)
- hgnc_symbol (str or tuple(str) or None) – HGNC symbol(s)
- hgnc_identifier (int or tuple(int) or None) – identifier(s) in models.HGNC
- limit (int or tuple(int) or None) –
  - if isinstance(limit,int)==True -> limit
  - if isinstance(limit,tuple)==True -> format:= tuple(page_number, results_per_page)
  - if limit == None -> all results
- as_df (bool) – if True results are returned as pandas.DataFrame
Returns:
- if as_df == False -> list(models.MGD)
- if as_df == True -> pandas.DataFrame
Return type: list(models.MGD) or pandas.DataFrame
-
omim(omimid=None, hgnc_symbol=None, hgnc_identifier=None, limit=None, as_df=False)
Method to query models.OMIM objects in database.
Parameters:
- omimid (str or tuple(str) or None) – Online Mendelian Inheritance in Man (OMIM) ID(s)
- hgnc_symbol (str or tuple(str) or None) – HGNC symbol(s)
- hgnc_identifier (int or tuple(int) or None) – identifier(s) in models.HGNC
- limit (int or tuple(int) or None) –
  - if isinstance(limit,int)==True -> limit
  - if isinstance(limit,tuple)==True -> format:= tuple(page_number, results_per_page)
  - if limit == None -> all results
- as_df (bool) – if True results are returned as pandas.DataFrame
Returns:
- if as_df == False -> list(models.OMIM)
- if as_df == True -> pandas.DataFrame
Return type: list(models.OMIM) or pandas.DataFrame
-
orthology_prediction(ortholog_species=None, human_entrez_gene=None, human_ensembl_gene=None, human_name=None, human_symbol=None, human_chr=None, human_assert_ids=None, ortholog_species_entrez_gene=None, ortholog_species_ensembl_gene=None, ortholog_species_db_id=None, ortholog_species_name=None, ortholog_species_symbol=None, ortholog_species_chr=None, ortholog_species_assert_ids=None, support=None, hgnc_identifier=None, hgnc_symbol=None, limit=None, as_df=False)
Method to query pyhgnc.manager.models.OrthologyPrediction
Parameters:
- ortholog_species (int) – NCBI taxonomy identifier
- human_entrez_gene (str) – Entrez gene identifier
- human_ensembl_gene (str) – Ensembl identifier
- human_name (str) – human gene name
- human_symbol (str) – human gene symbol
- human_chr (str) – human chromosome
- human_assert_ids (str) –
- ortholog_species_entrez_gene (str) – Entrez gene identifier for ortholog
- ortholog_species_ensembl_gene (str) – Ensembl gene identifier for ortholog
- ortholog_species_db_id (str) – Species specific database identifier (e.g. MGI:1920453)
- ortholog_species_name (str) – gene name of ortholog
- ortholog_species_symbol (str) – gene symbol of ortholog
- ortholog_species_chr (str) – chromosome identifier (ortholog)
- ortholog_species_assert_ids (str) –
- support (str) –
- hgnc_identifier (int) – HGNC identifier
- hgnc_symbol (str) – HGNC symbol
- limit (int or tuple(int) or None) –
  - if isinstance(limit,int)==True -> limit
  - if isinstance(limit,tuple)==True -> format:= tuple(page_number, results_per_page)
  - if limit == None -> all results
- as_df (bool) – if True results are returned as pandas.DataFrame
Returns:
- if as_df == False -> list(models.OrthologyPrediction)
- if as_df == True -> pandas.DataFrame
Return type: list(models.OrthologyPrediction) or pandas.DataFrame
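Since ortholog_species expects an NCBI taxonomy identifier rather than a species name, a small lookup table can make calls easier to read. The sketch below builds keyword arguments for orthology_prediction; NCBI_TAXONOMY_IDS and orthology_filters are hypothetical names, and the table lists only two well-known identifiers as an illustration.

```python
# NCBI taxonomy identifiers for two common model organisms.
NCBI_TAXONOMY_IDS = {
    "mouse": 10090,  # Mus musculus
    "rat": 10116,    # Rattus norvegicus
}


def orthology_filters(species, human_symbol):
    """Build keyword arguments for orthology_prediction() from a species name."""
    return {
        "ortholog_species": NCBI_TAXONOMY_IDS[species],
        "human_symbol": human_symbol,
    }


kwargs = orthology_filters("mouse", "TP53")

# With a real query object: query.orthology_prediction(**kwargs)
```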
-
pubmed(pubmedid=None, hgnc_symbol=None, hgnc_identifier=None, limit=None, as_df=False)
Method to query models.PubMed objects in database.
Parameters:
- pubmedid (str or tuple(str) or None) – PubMed identifier(s)
- hgnc_symbol (str or tuple(str) or None) – HGNC symbol(s)
- hgnc_identifier (int or tuple(int) or None) – identifier(s) in models.HGNC
- limit (int or tuple(int) or None) –
  - if isinstance(limit,int)==True -> limit
  - if isinstance(limit,tuple)==True -> format:= tuple(page_number, results_per_page)
  - if limit == None -> all results
- as_df (bool) – if True results are returned as pandas.DataFrame
Returns:
- if as_df == False -> list(models.PubMed)
- if as_df == True -> pandas.DataFrame
Return type: list(models.PubMed) or pandas.DataFrame
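The tuple form of limit lends itself to walking through a large result set page by page. The generator below is a sketch of that pattern for any query method that accepts limit=(page_number, results_per_page). iter_pages and fake_pubmed are hypothetical, and the sketch assumes pages are numbered from 0 and that a page past the end comes back empty; neither is stated in this reference.

```python
def iter_pages(query_method, results_per_page, **filters):
    """Yield result pages from `query_method` until an empty page is returned."""
    page_number = 0  # assumption: pages are numbered from 0
    while True:
        page = query_method(limit=(page_number, results_per_page), **filters)
        if len(page) == 0:
            break
        yield page
        page_number += 1


def fake_pubmed(limit=None, **filters):
    """Hypothetical stand-in for query.pubmed, serving slices of a fixed list."""
    data = list(range(7))
    page_number, results_per_page = limit
    start = page_number * results_per_page
    return data[start:start + results_per_page]


pages = list(iter_pages(fake_pubmed, 3))
```

With a real query object, query.pubmed (or any other query method) would be passed in place of fake_pubmed.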
-
ref_seq(accession=None, hgnc_symbol=None, hgnc_identifier=None, limit=None, as_df=False)
Method to query models.RefSeq objects in database.
Parameters:
- accession (str or tuple(str) or None) – RefSeq accession(s)
- hgnc_symbol (str or tuple(str) or None) – HGNC symbol(s)
- hgnc_identifier (int or tuple(int) or None) – identifier(s) in models.HGNC
- limit (int or tuple(int) or None) –
  - if isinstance(limit,int)==True -> limit
  - if isinstance(limit,tuple)==True -> format:= tuple(page_number, results_per_page)
  - if limit == None -> all results
- as_df (bool) – if True results are returned as pandas.DataFrame
Returns:
- if as_df == False -> list(models.RefSeq)
- if as_df == True -> pandas.DataFrame
Return type: list(models.RefSeq) or pandas.DataFrame
-
rgd(rgdid=None, hgnc_symbol=None, hgnc_identifier=None, limit=None, as_df=False)
Method to query models.RGD objects in database.
Parameters:
- rgdid (str or tuple(str) or None) – Rat genome database gene ID(s)
- hgnc_symbol (str or tuple(str) or None) – HGNC symbol(s)
- hgnc_identifier (int or tuple(int) or None) – identifier(s) in models.HGNC
- limit (int or tuple(int) or None) –
  - if isinstance(limit,int)==True -> limit
  - if isinstance(limit,tuple)==True -> format:= tuple(page_number, results_per_page)
  - if limit == None -> all results
- as_df (bool) – if True results are returned as pandas.DataFrame
Returns:
- if as_df == False -> list(models.RGD)
- if as_df == True -> pandas.DataFrame
Return type: list(models.RGD) or pandas.DataFrame
-
uniprot(uniprotid=None, hgnc_symbol=None, hgnc_identifier=None, limit=None, as_df=False)
Method to query models.UniProt objects in database.
Parameters:
- uniprotid (str or tuple(str) or None) – UniProt identifier(s)
- hgnc_symbol (str or tuple(str) or None) – HGNC symbol(s)
- hgnc_identifier (int or tuple(int) or None) – identifier(s) in models.HGNC
- limit (int or tuple(int) or None) –
  - if isinstance(limit,int)==True -> limit
  - if isinstance(limit,tuple)==True -> format:= tuple(page_number, results_per_page)
  - if limit == None -> all results
- as_df (bool) – if True results are returned as pandas.DataFrame
Returns:
- if as_df == False -> list(models.UniProt)
- if as_df == True -> pandas.DataFrame
Return type: list(models.UniProt) or pandas.DataFrame
-