- Introduction
- Molecular Biology Databases
- Database Search and Sequence Alignment
- Multiple Sequence Alignment
- Edgar & Batzoglou, Multiple Sequence Alignment, Current Opinion
in Structural Biology 16(3):368-373, June 2006
- Thompson,
et al, CLUSTAL W: Improving the Sensitivity of Progressive Multiple
Sequence Alignment Through Sequence Weighting, Position-Specific Gap
Penalties and Weight Matrix Choice.
Nucleic Acids Res. 1994 November 11; 22(22): 4673
- The ClustalW FAQ
- Wrabl & Grishin, Gaps in structurally similar proteins: Towards
improvement of multiple sequence alignment Proteins: Structure, Function, and Bioinformatics 2003 54(1): 71 - 87
- Sauder, et al, Large-scale
comparison of protein sequence alignment algorithms with structure
alignments. Proteins. 2000 Jul 1;40(1):6-22.
- Hidden Markov Models
- Sequence Assembly
- Gene Finding
- Burge & Karlin, Prediction of complete gene structures in human genomic DNA Journal of Molecular Biology
Volume 268, Issue 1 , 25 April 1997, Pages 78-94
- Reese, et al,
Genie: Gene Finding in Drosophila melanogaster Genome Research Vol. 10, Issue 4, 529-538, April 2000
- Ashburner, M, A Biologist's View of the Drosophila Genome Annotation Assessment Project Genome Research Vol. 10, Issue 4, 391-393, April 2000. [NB: The entire special issue on the GASP competition is of interest.]
- The EGASP 2006 gene finding evaluation. The entire special issue is worth reading, but you must read at least:
- Korf, et al.'s Twinscan ISMB 2001 paper
- Wu and Haussler, Coding exon detection using comparative sequences (ShortHMM), J. Computational Biology Jul 2006, Vol. 13, No. 6: 1148-1164
- Ohler, Shomron and Burge, Recognition of unknown conserved alternatively spliced exons. PLoS Computational Biology 2005 Jul;1(2):113-22.
- Optional readings
- Arumugam, et al, Pairagon+N-SCAN_EST: a model-based gene annotation pipeline Genome Biology 2006, 7(Suppl 1):S5
- Crollius, et al Estimate of human gene number provided by genome-wide analysis using Tetraodon nigroviridis DNA sequence (ExoFish) Nature Genetics 25, 235 - 238 (2000)
- Yao, et al, Evaluation of five ab initio gene prediction programs for the discovery of maize genes Plant Mol Biol. 2005 Feb;57(3):445-60.
- Foissac and Schiex, Integrating alternative splicing detection into gene prediction BMC Bioinformatics. 2005; 6: 25.
- Cawley and Pachter, HMM sampling and applications to gene finding and alternative splicing. Bioinformatics. 2003 Oct;19 Suppl 2:II36-II41.
- Carter, Dubchak and Holbrook A computational approach to identify genes for functional RNAs in genomic sequences Nucleic Acids Res. 2001 October 1; 29(19): 3928-3938
- Ohler, et al, Patterns of flanking sequence conservation and a characteristic upstream motif for microRNA gene identification. RNA. 2004 September; 10(9): 1309-1322.
- Protein Structure Prediction
- Simons, et al, Assembly of protein tertiary structures from fragments with similar local sequences using simulated annealing and Bayesian scoring functions. J Mol Biol. 1997 Apr 25;268(1):209-25.
- Ginalski, et al 3D-Jury: a simple approach to improve protein structure predictions. Bioinformatics. 2003 May 22;19(8):1015-8.
- Vincent, et al, Assessment of CASP6 predictions for new and nearly new fold targets
Proteins: Structure, Function, and Bioinformatics
Volume 61, Issue S7, Pages 67-83 [NB: you may find the whole special issue on CASP6 to be of interest.]
- Chivian, et al, Prediction of CASP6 structures using automated robetta protocols
Proteins: Structure, Function, and Bioinformatics
Volume 61, Issue S7, Pages 157-166
- D. Fischer, Servers for protein structure prediction Current Opinion in Structural Biology
Volume 16, Issue 2 , April 2006, Pages 178-182
- Karplus, et al, SAM-T04: What is new in protein-structure prediction for CASP6 Proteins: Structure, Function, and Bioinformatics
Volume 61, Issue S7, Pages 135-142
- You might also be interested in the preliminary CASP7 results.
- Mechanics, Dynamics & Docking
- Computational Phylogeny
- Genetic Analysis
- Short and opinionated overview
of linkage and association analysis by Robert Elston, one of
the founders of the field. Genetic Epidemiology
15(6):565-576, 1998. This was the 1997 International Genetic
Epidemiology Society Presidential Address [NB: only works from
UCHSC IP address.]
- The slides and notes from
the first lecture in Gil McVean's outstanding course on population genetics
at Oxford (slides from 2004, notes from the 2002 version). I strongly recommend reviewing the rest of the notes and slides from the population
genetics course.
Kent Holsinger has also produced a good set of lecture notes in
population genetics (the PDFs have amusing footnotes that are missing from the HTML versions). The Holsinger notes are from an undergraduate course and are easier to follow if you have never had any population genetics before.
- Slides from Terry Speed's outstanding 2007 ISMB keynote
- Lin and Zou, Assessing genomewide statistical significance in linkage studies. Genet Epidemiol. 2004 Nov;27(3):202-14.
- Marchini, et al, Genome-wide strategies for detecting multiple loci that influence complex diseases, Nature Genetics 37:413 - 417 (2005)
- The Introduction to the 14th Genetic Analysis Workshop. You can review the entire collection of associated papers for ones of particular interest to you. GAW 15 happened in 2006, but the papers aren't published yet; some results are available.
- Mailund, et al, Whole genome association mapping by incompatibilities and local perfect phylogenies BMC Bioinformatics 2006, 7:454
- Wang, et al., Genome-wide Association Studies: Theoretical and Practical Concerns
Nature Reviews Genetics 6, 109-118 (2005)
- The International HapMap Consortium, A second generation human haplotype map of over 3.1 million SNPs. Nature 449, 851-861. 2007.
- Barrett, et al, Haploview: analysis and visualization of LD and haplotype maps Bioinformatics 21(2):263-265, 2005.
- Optional:
- Microarrays
- Kozarova, A. et al., Array of informatics: Applications in modern research. J Proteome Res. 2006 May;5(5):1051-9.
- Allison, DB, et al., Microarray data analysis: from disarray to consolidation and consensus Nature Reviews Genetics 7, 55-65 (January 2006)
- Kreil, et al, Microarray Oligonucleotide Probes Methods in Enzymology
Volume 410 , 2006, Pages 73-98
- Quackenbush, Microarray data normalization and transformation Nat Genet. 2002 Dec;32 Suppl:496-501.
- Irizarry, Wu and Jaffee, Comparison of Affymetrix GeneChip expression measures Bioinformatics 22 (7): 789. Be sure to review the supplementary material, particularly the table of more recent results, and the related links on the course page.
- Kuo, et al., A sequence-oriented comparison of gene expression measurements across different hybridization-based technologies. Nat Biotechnol. 2006 Jul;24(7):832-40.
- The Microarray Quality Control consortium issue of Nature Biotechnology.
- Review article by UCHSC's Douglas
Curran-Everett, Multiple
comparisons: philosophies and illustrations American Journal
of Physiology - Regulatory, Integrative and Comparative Physiology
Vol. 279, Issue 1, R1-R8, July 2000
- Patrik D'haeseleer How does gene expression clustering work? Nature Biotechnology 23, 1499 - 1501 (2005)
- Subramanian, et al., Gene set enrichment analysis: A knowledge-based approach for interpreting genome-wide expression profiles PNAS | October 25, 2005 | vol. 102 | no. 43 | 15545-15550
- Optional: The original GCRMA paper, Wu, et al, A Model Based Background Adjustment for Oligonucleotide Expression Arrays (May 2004). Johns Hopkins University, Dept. of Biostatistics Working Papers. Working Paper 1.
- Durbin, et al, A
variance-stabilizing transformation for gene-expression microarray
data Bioinformatics Vol. 18 no. 9 2002 Pages
S105-S110
- Golub, et al, Molecular
Classification of Cancer: Class Discovery and Class Prediction by
Gene Expression Monitoring Science 1999 Oct
15;286(5439):531-7
- Brown, et al Knowledge-based
analysis of microarray gene expression data by using support vector
machines Proceedings of the National Academy of Sciences USA
97(1):262-267, January 4, 2000
- MacKay & Mishkin, Latent
Variable Models for Gene Expression Data Technical report.
- Brazhnik, et al Gene networks: how to put the function in genomics Trends in Biotechnology
Volume 20, Issue 11 , 1 November 2002, Pages 467-472
- Proteomics
- Wysocki, et al., Mass spectrometry of peptides and proteins Methods
Volume 35, Issue 3 , March 2005, Pages 211-222
- Chalkley, et al., Bioinformatic Methods to Exploit Mass Spectrometric Data for Proteomic Applications Methods in Enzymology
Volume 402 , 2005, Pages 289-312
- Li, et al., Data mining techniques for cancer detection using serum proteomic profiling Artificial Intelligence in Medicine
Volume 32, Issue 2 , October 2004, Pages 71-83
- Optional review of instruments and relevant biophysics: Lane, Mass spectrometry-based proteomics in the life sciences Cell Mol Life Sci. 2005 Apr;62(7-8):848-69.
- Biomedical Language Processing
- Hunter and Cohen, Biomedical Language Processing: What's Beyond PubMed? Molecular Cell 2006 Mar 3;21(5):589-94.
- Draft of our OpenDMAP manuscript.
- Shah, et al., Extraction of Transcript Diversity from Scientific Literature PLoS Comput Biol 1(1): e10
- Horn, et al., Automated extraction of mutation data from the literature: application of MuteXt to G protein-coupled receptors and nuclear hormone receptors. Bioinformatics. 2004 Mar 1;20(4):557-68.
- Hersh, et al., TREC Genomics 2006 Overview from the TREC Genomics site.
- Hirschman, et al., Overview of BioCreAtIvE: critical assessment of information extraction for biology BMC Bioinformatics 2005, 6(Suppl 1):S1. You may also be interested in the other articles from the BMC Bioinformatics special issue on the first BioCreative competition. The issue on the second BioCreative isn't out yet, but the BioCreative web site has some additional information.
- Everything Else
- Transcription Factor Binding Sites
- Whole genome comparisons
- Pathways, simulation and systems biology informatics
- Network analysis
- Pharmacoinformatics
- Claus and Underwood, Discovery informatics: its evolving role in drug discovery Drug Discovery Today
Volume 7, Issue 18 , 15 September 2002, Pages 957-966
- Louie, et al., Data Integration and Genomic Medicine Journal of Biomedical Informatics
Volume 40, Issue 1 , February 2007, Pages 5-16
- Dix, et al., The ToxCast Program for Prioritizing Toxicity Testing of Environmental Chemicals. Toxicol Sci. 2006 Sep 8;
- Metabolomics
- Neuroinformatics
- Forensic Bioinformatics