CS 6746, Fall 2003:
Research Seminar on Artificial Intelligence - Main topics: Clustering and Classification

Organizer:
Weixiong Zhang
Meeting time and place:
every Friday (see schedule below), 1pm - 2pm (we start at 1pm sharp!!), Bryan 509C
Requirement:
read every paper and be prepared!

Schedule

Date    Presenter                   Paper #
9/12    Jianhua Ruan                17
9/19    Danna Gurari                12
9/26    Chaochun Wei                9
10/3    David Jurgens               21
10/10   No meeting
10/17   No meeting
10/24   No meeting
10/31   Guandong Wang               3
11/7    Xuefeng Zhou
11/14   Zhao Xing
11/21   Eric Tenney
11/28   Thanksgiving (no meeting)
12/5    Sharlee Climer

Recommended reading list


Background reading (not for presentation): One of the best (and classic) books on classification and clustering is Pattern Classification and Scene Analysis by Richard Duda and Peter Hart, John Wiley & Sons, 1973.  A preprint is available at amazon.com.  A (not so) recent survey is here: A.K. Jain, M.N. Murty and P.J. Flynn, Data Clustering: A Review, ACM Computing Surveys, 31(3), 1999, pp. 264-323.  The order of the following papers is arbitrary, so pick one you like.  I suggest that you scan several papers, or at least read their abstracts, before making your decision.  Again, you are very welcome and encouraged to select a paper related to your research topic.  If you have papers that you would like to add to this reading list, please send me the titles and links to soft copies of the papers.

  1. Marcel Dettling and Peter Bühlmann, Supervised clustering of genes, Genome Biology 2002 3:research0069.1-0069.15
  2. ** Ron Shamir, Roded Sharan, Algorithmic approaches to clustering gene expression data, Current Topics in Computational Biology
  3. Yizong Cheng, George M. Church, Biclustering of Expression Data, ISMB 2000.
  4. Zhaohui S. Qin, et al., Identification of co-regulated genes through Bayesian clustering of predicted regulatory binding sites, Nature Biotechnology
  5. E. Segal, R. Yelensky, and D. Koller, Genome-wide Discovery of Transcriptional Modules from DNA Sequence and Gene Expression, Bioinformatics, 2003, 19: 1273-82
  6. * Gasch AP and Eisen MB (2002). Exploring the conditional coregulation of yeast gene expression through fuzzy k-means clustering. Genome Biology 3(11), 1-22.
  7. S. Sinha, Discriminative motifs, Proceedings of the sixth annual international conference on Computational biology, 2002
  8. ** Berman BP, Nibu Y, Pfeiffer BD, Tomancak P, Celniker SE, Levine M, Rubin GM and Eisen MB (2002). Exploiting transcription factor binding site clustering to identify cis-regulatory modules involved in pattern formation in the Drosophila genome. Proc Natl Acad Sci U S A 99, 757-62.
  9. Ying Xu, Victor Olman and Dong Xu, Clustering gene expression data using a graph-theoretic approach: an application of minimum spanning trees, Bioinformatics, Vol. 18 no. 4, 2002, pages 536-545
  10. Keles S, van der Laan M and Eisen MB (2002). Identification of regulatory elements using a feature selection method. Bioinformatics 18, 1167-75.
  11. ** Eisen MB, Spellman PT, Brown PO and Botstein D. (1998). Cluster Analysis and Display of Genome-Wide Expression Patterns. Proc Natl Acad Sci U S A 95, 14863-8
  12. Umar Syed, Golan Yona, Using a mixture of probabilistic decision trees for direct prediction of protein function, 2003, Proceedings of the seventh annual international conference on Computational molecular biology
  13. * Aik Choon Tan, David Gilbert, An empirical comparison of supervised machine learning techniques in bioinformatics, Proceedings of the First Asia-Pacific bioinformatics conference on Bioinformatics 2003 - Volume 19
  14. Isabelle Guyon, Jason Weston, Stephen Barnhill, Vladimir Vapnik, Gene Selection for Cancer Classification using Support Vector Machines, Machine Learning, Volume 46 Issue 1-3, 2002
  15. * Bianca Zadrozny, Charles Elkan, Learning and making decisions when costs and probabilities are both unknown, Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining, 2001
  16. A. Ben-Dor, L. Bruhn, N. Friedman, I. Nachman, M. Schummer, and Z. Yakhini, Tissue Classification with Gene Expression Profiles, J. Computational Biology, 7: 559-584, 2000
  17. Ljupčo Todorovski, Sašo Džeroski, Combining Classifiers with Meta Decision Trees, Machine Learning, Volume 50 Issue 3, 2003
  18. Gabriela G. Loots, Ivan Ovcharenko, Lior Pachter, Inna Dubchak, and Edward M. Rubin, rVista for Comparative Sequence-Based Discovery of Functional Transcription Factor Binding Sites, Genome Res. 2002 May;12(5):832-9.
  19. N. Friedman and Y. Barash, Context-Specific Bayesian Clustering for Gene Expression Data, Journal of Computational Biology, 9:169-191, 2002.
  20. N. Friedman, M. Ninio, I. Pe'er, and T. Pupko, A Structural EM Algorithm for Phylogenetic Inference, Journal of Computational Biology, 9:331-353, 2002
  21. N. Friedman, D. Geiger, and M. Goldszmidt, Bayesian Network Classifiers, Machine Learning, 29:131-163, 1997.
  22. N. Friedman, T. Pupko, I. Pe'er, M. Hasegawa, and D. Graur, A branch-and-bound algorithm for the inference of ancestral amino-acid sequences when the replacement rate varies among sites, Bioinformatics, 18:1116-1123, 2002