Bioinformatics
Bioinformatics is the application of information technology to experimental biological data: to collect, store, retrieve, and analyze that data; to predict the composition of molecules (nucleic acids, proteins, etc.); and to model biological systems through mathematical, statistical, and computational methods.
Bioinformatics is a union of biology and informatics and involves technology that uses computers for storage, retrieval, manipulation, and distribution of information related to biological macromolecules such as DNA, RNA, and proteins.
The emphasis is on use of computers because most of the tasks in genomic data analysis are highly repetitive or mathematically complex. The use of computers is absolutely indispensable in mining genomes for information gathering and knowledge building.
Bioinformatics is limited to sequence, structural, and functional analysis of genes and genomes and their corresponding products, and is often considered computational molecular biology. Computational biology, by contrast, encompasses all biological areas that involve computation. For example, mathematical modeling of ecosystems, population dynamics, the application of game theory in behavioral studies, and phylogenetic construction using fossil records all employ computational tools but do not necessarily involve biological macromolecules.
Origin of Bioinformatics
"The term 'bioinformatics' is a relatively recent invention, not appearing in the literature until 1991 and then only in the context of the emergence of electronic publishing... However, some of my role models when I was a graduate student (Margaret O. Dayhoff, Russell F. Doolittle, Walter M. Fitch and Andrew D. McLachlan) had been building databases, developing algorithms and making biological discoveries by sequence analysis since the 1960s, long before anyone thought to label this activity with a special term (if anything it was called 'molecular evolution'). Even a relatively new kid on the block, the National Center for Biotechnology Information (NCBI), is celebrating its 10th anniversary this year, having been written into existence by US Congressman Claude Pepper and President Ronald Reagan in 1988. So bioinformatics has, in fact, been in existence for more than 30 years and is now middle-aged."
Ref: Mark S. Boguski, in "Trends Guide to Bioinformatics", Elsevier, Trends Supplement 1998, p. 1.
Goal of Bioinformatics
The goal of bioinformatics is to uncover the wealth of biological information hidden in the mass of sequence, structure, literature, and other biological data, to obtain a clearer insight into the fundamental biology of organisms, and to put this information to use. In molecular medicine, it helps produce better and more customized medicines to prevent or cure diseases; in environmental work, it helps identify bacteria that can clean up waste; and in agriculture, it can be used to produce high-yield, low-maintenance crops.
These are just a few of the many benefits bioinformatics will help develop.
By analyzing raw molecular sequence and structural data, bioinformatics research can generate new insights and provide a "global" perspective of the cell. The reason that the functions of a cell can be better understood by analyzing sequence data is ultimately because the flow of genetic information is dictated by the "central dogma" of biology in which DNA is transcribed to RNA, which is translated to proteins. Cellular functions are mainly performed by proteins whose capabilities are ultimately determined by their sequences. Therefore, solving functional problems using sequence and sometimes structural approaches has proved to be a fruitful endeavor.
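The flow of information described above can be illustrated with a minimal Python sketch. It transcribes a DNA coding strand into mRNA and translates the mRNA with the standard genetic code. The function names transcribe and translate and the example sequence are illustrative assumptions, not part of any particular library. In the cell, transcription reads the template strand; this sketch takes the common shortcut of replacing T with U on the coding strand.

```python
from itertools import product

# Standard genetic code, built from the conventional U/C/A/G ordering.
# "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(dna):
    """Return the mRNA for a DNA coding strand (T replaced by U)."""
    return dna.upper().replace("T", "U")


def translate(mrna):
    """Translate mRNA from its first base, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    dna = "ATGGCCATTGTAATGGGCCGCTGA"  # hypothetical example sequence
    mrna = transcribe(dna)
    print(mrna)             # AUGGCCAUUGUAAUGGGCCGCUGA
    print(translate(mrna))  # MAIVMGR
```

The sketch assumes the reading frame starts at the first base and that the input contains only A, C, G, and T; real sequences need frame detection and handling of ambiguous bases.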
Importance of Bioinformatics
The greatest challenge facing the molecular biology community today is to make sense of the wealth of data that has been produced by the genome sequencing projects. Traditionally, molecular biology research was carried out entirely at the experimental laboratory bench, but the huge increase in the scale of data being produced in this genomic era has created a need to incorporate computers into the research process.
Sequence generation, and its subsequent storage, interpretation and analysis are entirely computer dependent tasks.
However, the molecular biology of an organism is very complex, with research being carried out at different levels including the genome, proteome, transcriptome, and metabolome. Following the explosion in the volume of genomic data, similar increases in data have been observed in the fields of proteomics, transcriptomics, and metabolomics.
The first challenge facing the bioinformatics community today is the intelligent and efficient storage of this mass of data. It is then their responsibility to provide easy and reliable access to this data. The data itself is meaningless before analysis and the sheer volume present makes it impossible for even a trained biologist to begin to interpret it manually. Therefore, incisive computer tools must be developed to allow the extraction of meaningful biological information.
There are three central biological processes around which bioinformatics tools must be developed:
• DNA sequence determines protein sequence
• Protein sequence determines protein structure
• Protein structure determines protein function
Integrating the information learned about these key biological processes should allow us to achieve the long-term goal of a complete understanding of the biology of organisms.
Scope of Bioinformatics
Bioinformatics consists of two subfields: the development of computational/biological tools and databases, and the application of these tools and databases in generating biological knowledge to better understand living systems. These two subfields are complementary to each other. Tool development includes writing software for sequence, structural, and functional analysis, as well as the construction and curation of biological databases. These tools are used in three areas of genomic and molecular biological research: molecular sequence analysis, molecular structural analysis, and molecular functional analysis.
The analyses of biological data often generate new problems and challenges that in turn spur the development of new and better computational tools. The areas of sequence analysis include sequence alignment, sequence database searching, motif and pattern discovery, gene and promoter finding, reconstruction of evolutionary relationships, and genome assembly and comparison. Structural analyses include protein and nucleic acid structure analysis, comparison, classification, and prediction. The functional analyses include gene expression profiling, protein-protein interaction prediction, protein subcellular localization prediction, metabolic pathway reconstruction, and simulation.
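Sequence alignment, the first sequence-analysis task listed above, can be illustrated with a short Python sketch of global alignment scoring by dynamic programming (the Needleman-Wunsch recurrence). The function name nw_score and the scoring values (match +1, mismatch -1, gap -1) are assumptions chosen for illustration; practical tools use substitution matrices and more elaborate gap penalties.

```python
def nw_score(a, b, match=1, mismatch=-1, gap=-1):
    """Return the optimal global alignment score of sequences a and b.

    Only the previous row of the dynamic-programming table is kept,
    so memory use is proportional to len(b).
    """
    # Row 0: aligning the empty prefix of a against prefixes of b (all gaps).
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        # Column 0: aligning a[:i] against the empty prefix of b.
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diagonal = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # gap inserted in b
            left = curr[j - 1] + gap  # gap inserted in a
            curr.append(max(diagonal, up, left))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(nw_score("ACGT", "ACGT"))  # 4: four matches
    print(nw_score("ACGT", "AGT"))   # 2: three matches and one gap
```

This version returns only the score; recovering the alignment itself would require keeping the full table and tracing back from the final cell.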
The three aspects of bioinformatics analysis are not isolated but often interact to produce integrated results. For example, protein structure prediction depends on sequence alignment data; clustering of gene expression profiles requires the use of phylogenetic tree construction methods derived in sequence analysis. Sequence-based promoter prediction is related to functional analysis of co-expressed genes. Gene annotation involves a number of activities, which include distinction between coding and noncoding sequences, identification of translated protein sequences, and determination of the gene's evolutionary relationship with other known genes; prediction of its cellular functions employs tools from all three groups of the analyses.
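One step of gene annotation mentioned above, distinguishing coding from noncoding sequence, can be approximated very crudely by scanning for open reading frames (ORFs). The Python sketch below is a simplification under stated assumptions: it searches only the forward strand, treats ATG as the only start codon, and uses a hypothetical function name find_orfs and a hypothetical min_codons threshold. Real gene finders rely on statistical models and also examine the reverse strand.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=2):
    """Return (start, end) index pairs of forward-strand ORFs.

    Each ORF runs from an ATG to the first in-frame stop codon; end is
    exclusive and includes the stop codon. min_codons counts the codons
    before the stop, including the ATG.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # open a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None  # close the ORF and keep scanning this frame
    return sorted(orfs)


if __name__ == "__main__":
    dna = "CCATGAAATAGGG"  # hypothetical example sequence
    for start, end in find_orfs(dna):
        print(start, end, dna[start:end])  # 2 11 ATGAAATAG
```

An ATG with no in-frame stop codon before the end of the sequence is not reported, which is a deliberate choice in this sketch rather than a biological rule.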
New Themes in Bioinformatics
There is no doubt that bioinformatics has revolutionized biological research and holds great potential for further transforming research in the coming decades. Currently, the field is undergoing major expansion. In addition to providing more reliable and more rigorous computational tools for sequence, structural, and functional analysis, the major challenge for future bioinformatics development is to develop tools for elucidating the functions and interactions of all gene products in a cell. This presents a tremendous challenge because it requires the integration of disparate fields of biological knowledge and a variety of complex mathematical and statistical tools.
To gain a deeper understanding of cellular functions, mathematical models are needed to simulate a wide variety of intracellular reactions and interactions at the whole-cell level. This molecular simulation of all the cellular processes is termed systems biology. Achieving this goal will represent a major leap toward fully understanding a living system. That is why system-level simulation and integration are considered the future of bioinformatics. Modeling such complex networks and making predictions about their behavior present tremendous challenges and opportunities for bioinformaticians. The ultimate goal of this endeavor is to transform biology from a qualitative science to a quantitative and predictive science. This is truly an exciting time for bioinformatics.
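To give a flavor of the mathematical models mentioned above, here is a toy Python sketch of a single gene's expression, far simpler than any whole-cell model. It assumes two ordinary differential equations, dm/dt = k_tx - d_m * m for mRNA and dp/dt = k_tl * m - d_p * p for protein, integrated with the forward Euler method. The function name simulate_expression and all rate constants are hypothetical values chosen for illustration, not measured parameters.

```python
def simulate_expression(k_tx=2.0, d_m=0.2, k_tl=1.0, d_p=0.1,
                        dt=0.01, t_end=200.0):
    """Integrate a two-variable gene expression model with forward Euler.

    k_tx: transcription rate, d_m: mRNA degradation rate,
    k_tl: translation rate per mRNA, d_p: protein degradation rate.
    Returns the final (mRNA, protein) levels, starting from zero.
    """
    m, p = 0.0, 0.0
    steps = int(round(t_end / dt))
    for _ in range(steps):
        dm = k_tx - d_m * m      # synthesis minus degradation of mRNA
        dp = k_tl * m - d_p * p  # synthesis minus degradation of protein
        m += dt * dm
        p += dt * dp
    return m, p


if __name__ == "__main__":
    m, p = simulate_expression()
    # Analytical steady state: m = k_tx / d_m, p = k_tl * m / d_p
    print(round(m, 3), round(p, 3))  # 10.0 100.0
```

With these assumed parameters the simulation settles at the analytical steady state, which gives a simple check that the integration behaves as expected. Systems-level models couple many such equations and usually call for more robust numerical solvers than forward Euler.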