Q1. Basic knowledge of DNA, RNA and protein is important for better understanding of the subject. The following questions will help you get an overall idea of the basic biological molecules and databases.
a) What is central dogma? Explain the process briefly and the importance of DNA, RNA and protein in protein synthesis.
b) List 3 primary sequence databases along with their respective URL’s for obtaining DNA and Protein sequence.
c) List 3 protein structure databases with their respective URL’s.
d) What is Genome? List three genome databases with their respective URL’s.
Q2. Bioinformatics is an intelligent method for obtaining biological knowledge using computational techniques. In this question you will execute a workflow to produce a biological outcome
Overview of the Central Dogma of Life
The central dogma of life describes the flow of genetic information from DNA to RNA and from RNA to protein, the functional product. [“Yourgenome” (n.d).]
Three processes carry genetic information along the DNA → RNA → protein pathway. [David, L.N & Michael, M.C (2004)]
- The first step is “Replication”: new DNA is made from existing DNA, that is, the parental DNA is copied to make a daughter DNA molecule.
- The second step is “Transcription”: new RNA is made from DNA, that is, the genetic information encoded in DNA is copied into messenger RNA (mRNA).
- The third step is “Translation”: new protein is made from RNA, that is, the genetic message encoded in mRNA is translated on the ribosomes into a polypeptide with a particular sequence of amino acids.
DNA, RNA and protein each play a crucial role in the central dogma of life. [Lodish, H., Berk, A. & Zipursky, S.L., et al (2000).]
- DNA (deoxyribonucleic acid) stores genetic information and carries it from one generation to the next.
- RNA (ribonucleic acid) carries this information to the ribosomes. Three kinds of RNA are involved: messenger RNA (mRNA) carries the genetic information copied from DNA, transfer RNA (tRNA) decodes the codons of the mRNA by delivering the matching amino acids, and ribosomal RNA (rRNA) associates with a set of proteins to form the ribosomes.
- Proteins are the building blocks of the body and carry out most of the tasks needed for cell function.
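The transcription and translation steps described above can be illustrated with a minimal Python sketch. It is not a full implementation: the codon table is deliberately partial, the input is assumed to be the coding strand of DNA, and the toy sequence is hypothetical rather than a real gene.

```python
# Minimal sketch of transcription and translation (illustrative only).
# Partial codon table: just the codons used in the toy example below.
CODON_TABLE = {
    "AUG": "M",  # start codon, methionine
    "GCU": "A", "GCC": "A",
    "AAA": "K", "AAG": "K",
    "UGG": "W",
    "UAA": "*", "UAG": "*", "UGA": "*",  # stop codons
}


def transcribe(coding_dna):
    """Return the mRNA for a coding-strand DNA sequence (T is replaced by U)."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna):
    """Translate mRNA codon by codon from the first AUG until a stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        # "X" marks a codon that is missing from the partial table.
        amino_acid = CODON_TABLE.get(mrna[i:i + 3], "X")
        if amino_acid == "*":
            break
        peptide.append(amino_acid)
    return "".join(peptide)


dna = "ATGGCTAAATGGTAA"  # hypothetical toy sequence, not a real gene
mrna = transcribe(dna)
protein = translate(mrna)
print(mrna)     # AUGGCUAAAUGGUAA
print(protein)  # MAKW
```

A complete version would use the full 64-codon genetic code; in the cell, translation is carried out by the ribosome and tRNAs rather than by a lookup table.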
- List 3 primary sequence databases along with their respective URL’s for obtaining DNA and Protein sequence.
A database is a computerized archive used to store and organize information so that it can be accessed, managed, and updated easily. Biological databases are libraries of life sciences information, with data collected from scientific experiments, published literature, high-throughput experimental technology, and computational analyses.
Biological databases can be classified into three types [Jin, X. (2006)]:
- Primary databases: contain original biological data submitted by scientists and scientific institutes.
- Secondary databases: contain information derived from the original biological data, either computationally processed or manually curated.
- Specialized databases: contain information on a particular research interest or data for a particular organism.
A sequence database consists of DNA or protein sequences and information about those sequences. A primary sequence database contains molecular biology data in its original format. Primary sequence databases hold a mixture of data from many different organisms, including whole genome sequences, gene sequences derived from genomic DNA or mRNA (cDNA), chromosome sequences, complete or partial sequences, and annotated or unannotated entries with established or predicted functions. A large number of entries must therefore be screened to identify the sequence of interest in a primary database.
The three primary sequence databases are [“Biological Databases” (n.d)]:
- GenBank is the nucleotide sequence database maintained by NCBI. The URL of the database is https://www.ncbi.nlm.nih.gov/genbank/.
- EMBL (European Molecular Biology Laboratory) maintains the European nucleotide sequence database, now known as the European Nucleotide Archive (ENA). The URL of the database is https://www.embl.org/.
- UniProtKB is a knowledgebase that contains information about protein sequences and functions. The URL of the database is https://www.uniprot.org/.
- List 3 protein structure databases with their respective URL’s.
Proteins are long chains of amino acids: linear, unbranched polymers. The function of a protein generally depends on interactions with other molecules, which can lead to changes in protein conformation.
The structural organization of proteins can be classified into four levels [Particle Science Drug Development Services (2009)]:
- Primary structure: the amino acid sequence specified by genetic information.
- Secondary structure: localized folding of the polypeptide chain into regular arrangements of nearby amino acids. The two main types of secondary structure are the α-helix and the β-sheet.
- Tertiary structure: the overall three-dimensional shape of the entire protein molecule.
- Quaternary structure: the arrangement of multiple identical or different polypeptide chains (protein subunits) within one protein.
Protein structure databases hold a wide variety of information about the three-dimensional structures and functions of proteins.
The three protein structure databases are [“Israel Science and Technology Directory” (n.d)]:
- PDB (Protein Data Bank) is the worldwide central repository of protein structural information. The URL of the database is https://www.rcsb.org/.
- SCOPe (Structural Classification of Proteins - extended) contains a classification of protein structures and, within that classification, their sequences. The URL of the database is https://scop.berkeley.edu/.
- CATH/Gene3D classifies protein structures by Class, Architecture, Topology, and Homology. The URL of the database is https://www.cathdb.info/.
- What is Genome? List three genome databases with their respective URL’s.
The word “genome” combines “gene” and “chromosome”: a gene is the biological unit of heredity, and a chromosome is the carrier of genetic information in the form of genes. The genome is therefore defined as the complete set of hereditary instructions in each cell of a living organism that is needed for its development and growth.
This set of instructions is made up of DNA. Every living organism has its own unique genome. For example, the human genome is very large, consisting of about 3.2 billion bases of DNA, whereas genome size differs in other organisms [“Yourgenome” (n.d)].
Genome Databases
Genome databases are organized collections of information produced by sequencing or mapping genomes or genome products (transcripts, proteins). They contain a wide variety of biological information.
There are different types of genome databases, such as human genome databases, model organism databases (MOD), other organism databases, organelle databases, and virus databases [Jin, X. (2006)].
The three genome databases are [Winston, H. (2005)]:
- MaizeGDB (Maize Genome Database) focuses mainly on maize (Zea mays), which is both a crop plant and a model organism. The URL of the database is https://www.maizegdb.org/#.
- OMIM (Online Mendelian Inheritance in Man) is a genome-wide catalogue of human genes and genetic disorders. The URL of the database is https://www.omim.org/.
- Ensembl is a genome browser for vertebrate genomes that supports research in comparative genomics, evolution, sequence variation, and transcriptional regulation. The URL of the database is https://asia.ensembl.org/index.html.
Q2. Bioinformatics is an intelligent method for obtaining biological knowledge using computational techniques. In this question you will execute a workflow to produce a biological outcome.
We will investigate Myosin VI, a protein involved in cancer in humans.
- Using the NCBI Gene Database, investigate Myosin VI for Homo sapiens. This display has a lot of information; list the information you infer about the particular gene.
Brief Function of the gene (In your own words).
Myosins are actin-based motor proteins. The gene that encodes Myosin VI in Homo sapiens is MYO6, a protein-coding gene. It plays a role in various intracellular processes such as cell migration and membrane trafficking. The gene encodes a reverse-direction motor protein that moves toward the minus end of actin filaments, and it also plays an important role in intracellular vesicle and organelle transport.
- What are the major pathways involved and how many exons are present?
The information on major pathways is obtained from the “Pathways from BioSystems” section:
- Gap junction degradation, organism-specific biosystem (from REACTOME)
- Gap junction trafficking, organism-specific biosystem (from REACTOME)
- Gap junction trafficking and regulation, organism-specific biosystem (from REACTOME)
- Glutamate Binding, Activation of AMPA Receptors and Synaptic Plasticity, organism-specific biosystem (from REACTOME)
- Membrane Trafficking, organism-specific biosystem (from REACTOME)
- Neuronal System, organism-specific biosystem (from REACTOME)
- Neurotransmitter Receptor Binding And Downstream Transmission In The Postsynaptic Cell, organism-specific biosystem (from REACTOME)
- Stabilization and expansion of the E-cadherin adherens junction, organism-specific biosystem (from Pathway Interaction Database)
- Trafficking of AMPA receptors, organism-specific biosystem (from REACTOME)
- Transmission across Chemical Synapses, organism-specific biosystem (from REACTOME)
- Vesicle-mediated transport, organism-specific biosystem (from REACTOME)
The exon count reported for the MYO6 gene is 37.
- Scroll to the “NCBI Reference Sequences” section and click the protein sequence, which will begin with NP_. You will be taken to the entry in NCBI Protein. Select “FASTA”, then click “Send to File” to save the protein sequence.
Two protein sequences are obtained from the NCBI Reference Sequences section:
The first is NCBI Reference Sequence NP_004990.3, defined as unconventional myosin-VI isoform 1 [Homo sapiens].
The second is NCBI Reference Sequence NP_001287828.1, defined as unconventional myosin-VI isoform 2 [Homo sapiens].
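The sequences above were saved manually through “Send to File”. As an alternative, the same FASTA records could in principle be retrieved with a script. The sketch below assumes the NCBI E-utilities `efetch` endpoint with `db=protein` and `rettype=fasta`; the helper names (`efetch_fasta_url`, `parse_fasta`, `fetch_fasta`) are hypothetical, and the example record is a made-up placeholder, not the real MYO6 sequence.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint: NCBI E-utilities EFetch.
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_fasta_url(accession):
    """Build an EFetch URL that returns a protein record as FASTA text."""
    params = {"db": "protein", "id": accession, "rettype": "fasta", "retmode": "text"}
    return EFETCH + "?" + urlencode(params)


def parse_fasta(text):
    """Map each FASTA header (without '>') to its concatenated sequence."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = ""
        elif header is not None:
            records[header] += line
    return records


def fetch_fasta(accession):
    """Download and parse one record (needs network access; not called here)."""
    with urlopen(efetch_fasta_url(accession)) as response:
        return parse_fasta(response.read().decode("utf-8"))


url = efetch_fasta_url("NP_004990.3")
# Placeholder record to exercise the parser; the residues are made up.
example = ">example_header placeholder\nMAKW\nAKWM\n"
parsed = parse_fasta(example)
print(url)
print(parsed)
```

Calling `fetch_fasta("NP_004990.3")` or `fetch_fasta("NP_001287828.1")` would then return the header and sequence of each isoform, which can be pasted into the modelling server in the next step.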
- Paste your sequence into the structure model server https://swissmodel.expasy.org/interactive. The 3D structure of your protein will be generated. Provide the screenshot of your result page.
Using SWISS-MODEL, three-dimensional structures were generated for both RefSeq proteins. The screenshots of the results are displayed below.
RESULTS:
Five models were generated from the FASTA sequence of unconventional myosin-VI isoform 1 [Homo sapiens], RefSeq: NP_004990.3.
Six models were generated from the FASTA sequence of unconventional myosin-VI isoform 2 [Homo sapiens], RefSeq: NP_001287828.1.
Download the PDB file of the model generated and provide the .PDB file.
Five models were generated for unconventional myosin-VI isoform 1 [Homo sapiens], RefSeq: NP_004990.3, using SWISS-MODEL, and the .PDB files are provided with this document as model_01.pdb, model_02.pdb, model_03.pdb, model_04.pdb, and model_05.pdb.
Six models were generated for unconventional myosin-VI isoform 2 [Homo sapiens], RefSeq: NP_001287828.1, using SWISS-MODEL, and the .PDB files are provided with this document as model_01.pdb, model_02.pdb, model_03.pdb, model_04.pdb, model_05.pdb, and model_06.pdb.
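The downloaded .PDB files are plain text and can be inspected with a short script. The sketch below counts residues per chain by counting C-alpha ATOM records, assuming the standard fixed-column PDB layout (atom name in columns 13-16, chain ID in column 22). The function name `residues_per_chain` is hypothetical, and the sample records use made-up coordinates that are not taken from any of the generated models.

```python
from collections import Counter


def residues_per_chain(pdb_text):
    """Count residues per chain by counting C-alpha ATOM records."""
    counts = Counter()
    for line in pdb_text.splitlines():
        # Fixed columns (1-based): atom name 13-16, chain ID 22.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            counts[line[21]] += 1
    return counts


# Placeholder records with made-up coordinates, only to exercise the parser.
sample = "\n".join([
    "ATOM      1  N   MET A   1      11.104  13.207   2.100  1.00 20.00           N",
    "ATOM      2  CA  MET A   1      12.560  13.100   2.300  1.00 20.00           C",
    "ATOM      3  CA  ALA A   2      14.000  15.000   3.000  1.00 20.00           C",
    "ATOM      4  CA  LYS B   1      20.000  21.000   5.000  1.00 20.00           C",
    "END",
])
counts = residues_per_chain(sample)
print(dict(counts))  # {'A': 2, 'B': 1}
```

For a real model the text would be read from disk, for example `residues_per_chain(open("model_01.pdb").read())`, which gives a quick check of how much of the target sequence each model covers.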
- Analyze the report provided in the results page and provide the following information.
- What is the template used in building the model and state the identity of the target with the template.
- The templates used in building the models for unconventional myosin-VI isoform 1 [Homo sapiens], RefSeq: NP_004990.3 are:
SR NO. | TEMPLATE | DESCRIPTION | SEQ IDENTITY (%)
1 | 2bki.1.A | UNCONVENTIONAL MYOSIN | 98.25
2 | 4anj.1.A | UNCONVENTIONAL MYOSIN-VI, GREEN FLUORESCENT PROTEIN | 97.92
3 | 2kia.1.A | Myosin-VI | 94.57
4 | 2n12.1.A | Unconventional myosin-VI | 100.00
5 | 2n11.1.A | Unconventional myosin-VI | 100.00
The templates used in building the models for unconventional myosin-VI isoform 2 [Homo sapiens], RefSeq: NP_001287828.1 are:
SR NO. | TEMPLATE | DESCRIPTION | SEQ IDENTITY (%)
1 | 2bki.1.A | UNCONVENTIONAL MYOSIN | 98.25
2 | 4anj.1.A | UNCONVENTIONAL MYOSIN-VI, GREEN FLUORESCENT PROTEIN | 97.92
3 | 2kia.1.A | Myosin-VI | 94.57
4 | 2n12.1.A | Unconventional myosin-VI | 98.48
5 | 5aj4.19.A | MITORIBOSOMAL PROTEIN MS26, MRPS26 | 16.85
6 | 2n11.1.A | Unconventional myosin-VI | 97.50
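The sequence identity values reported for each template can be understood as the percentage of aligned positions at which target and template carry the same residue. The sketch below shows one common way to compute this from a pairwise alignment, assuming that columns with a gap in either sequence are excluded from the denominator. SWISS-MODEL's exact definition may differ, and the toy alignment is hypothetical rather than taken from the model report.

```python
def percent_identity(target, template):
    """Percent of aligned (non-gap) columns where both residues are identical."""
    if len(target) != len(template):
        raise ValueError("aligned sequences must have equal length")
    aligned = 0
    matches = 0
    for a, b in zip(target, template):
        if a == "-" or b == "-":
            continue  # skip columns with a gap in either sequence
        aligned += 1
        if a == b:
            matches += 1
    return 100.0 * matches / aligned if aligned else 0.0


# Hypothetical toy alignment, not taken from the SWISS-MODEL report.
identity = percent_identity("MAKW-LT", "MAKWQLS")
print(round(identity, 2))  # 83.33
```

Here five of the six gap-free columns match, giving 83.33%. On this measure, a value such as 16.85 for template 5aj4.19.A indicates a far more distant template than the myosin-VI structures.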
Provide the sequence alignment of the target and the template. (Check the model report provided in the results page).
The sequence alignment of the target and the template for unconventional myosin-VI isoform 1 [Homo sapiens], RefSeq: NP_004990.3 is provided in the screenshots given below.
References
The Central Dogma (n.d). Retrieved from https://pos-darwinista.blogspot.in/2017/03/modelos-de-expressao-global-de-gene.html.
Lodish, H., Berk, A. & Zipursky, S.L., et al (2000). “Molecular Cell Biology”. New York: W. H. Freeman. Retrieved from https://www.ncbi.nlm.nih.gov/books/NBK21603/
“Yourgenome” (n.d). Retrieved from https://www.yourgenome.org/facts/what-is-the-central-dogma.
David, L.N & Michael, M.C (2004). “Lehninger Principles of Biochemistry”. University of Wisconsin–Madison.
“Israel Science and Technology Directory” (n.d). Retrieved from https://www.science.co.il/biomedical/databases/Protein-databases.php
Jin, X. (2006). “Essential Bioinformatics”. Cambridge University Press.
“Biological Databases” (n.d). Retrieved from https://bioinf.comav.upv.es/courses/biotech3/theory/databases.html
Particle Science Drug Development Services (2009). Protein Structure. Retrieved from https://www.particlesciences.com/docs/technical_briefs/TB_8.pdf
Winston, H. (2005). “Genome Databases”. Encyclopedia of Life Sciences. John Wiley & Sons, Ltd. doi: 10.1038/npg.els.0005314