Fall 2009 - String Algorithms and Algorithms in Computational Biology - Gusfield

  • This index page links to the course handouts that are available on the web, and briefly describes them.

    Distribution List

    1. Class on Tuesday, Oct. 11 is cancelled. Watch the video on the least common ancestor algorithm. Class on Thursday is scheduled as normal, unless you hear otherwise from me.
    2. Course Syllabus (brief)
    3. Homework 1, Due Oct. 6, but don't wait so long to start.
    4. Notes on the Z-algorithm (in pdf now)
    5. Notes on Boyer-Moore
    6. First Notes on Suffix trees
    7. Notes on Lempel-Ziv string compression using suffix trees
    8. Notes on suffix arrays
    9. Notes on computing the LCP (or Depth) array in linear time
    10. Homework 2 (typo corrected in problem 1), Due Oct. 15, but don't wait so long to start.
    11. Solution to the RNA matching count problem on HW 1
    12. Perl program to count the number of matchings. Try it out and see if you find an error.

    13. Get "Taxonomy of Suffix Array Construction Algorithms" here (paper 41 in the list).
    14. Replacing suffix trees with suffix arrays. Possible paper for student presentation.
    15. The linear-time construction algorithm for suffix arrays we discussed in class is presented in a more general setting in "Linear work suffix array construction", Journal of the ACM (JACM), Volume 53, Issue 6, November 2006, Pages 918-936.

    16. List of links to videos of lectures for CS 222A Fall 2007
      The lecture on November 21 is on the Z-algorithm. The lecture on November 30 and the start of the lecture on December 3 cover a linear-time preprocessing, constant-time lookup algorithm for the Least Common Ancestor (LCA) in a tree.
    17. Notes on the K-common substrings problem and linear-time solution
    18. We will not discuss edit distance and string alignment in detail in this class, since most people have already been exposed to these topics, and because they have simple solutions via dynamic programming (DP); the local alignment problem is more complex. If you have never seen the DP for alignment or edit distance, see the video lecture of Oct. 8, 2007 in the CS 222A videos, or see the videos posted for CS 124, lectures 5 through 11.
    19. Notes on the Unique Decypherability Problem
    20. Perl program for Unique Decypherability Problem (please test for errors)
    21. Homework 3, Due Oct. 27, but don't wait so long to start.
    22. Notes on the Perfect Phylogeny Problem. This is an extended version of the notes put here earlier. Sections on the Splits Equivalence Theorem and 3-state perfect phylogeny were added to the original notes on October 22.

    23. As mentioned in class, good sources of papers in combinatorial string and computational biology algorithms are: the Combinatorial Pattern Matching (CPM) Conference, the Workshop on Algorithms in Bioinformatics (WABI), and the International Symposium on Bioinformatics Research and Applications (ISBRA).
    24. Solutions to homework 1.
    25. Solution to the first problem of homework 2.
    26. Homework 4 corrected. Due Nov. 5, although that might not be enough time. Start soon and let me know. Problem 4 had an error in it, and is now corrected. In short, one needs a column in M1 and M2 for each node in T1 and T2, including leaf nodes.
    27. Introduction to Recombination and ARGs
    29. Additional list of possible student presentation topics - choose soon. Also see the Neighbor-joining paper. It shows that if the distances are additive, then the NJ algorithm will create the correct additive tree. The Pachter paper should show this too, and more, but this paper might be simpler for this simpler result.
    30. Solutions to homework 2.
    31. Notes on lower bounds
    32. You can hand in HW 4 in class next Tuesday.
    33. Homework 5 (five problems now, so reload if you downloaded before 12:13 am Nov. 11). Due Nov. 19.
    34. Notes on Fundamental Combinatorial Structures. Corrected Nov. 13 to handle the problem of trivial connected components that arose in lecture yesterday.
    35. Notes on the CC lower bound and the decomposition theorem
    36. Solutions to homework 3.
    37. Notes on Galled-Trees.
    38. Paper on Hapbound and Shrub. Program Shrub is discussed in Section 4. You will need to read Section 4 to prepare for one of the final exam questions.
    39. The take-home final exam, including problems 1, 2, and 3 posted earlier. Note that in problem 2, `galled-tree' has now been changed to `reduced galled-tree'.
    40. Solutions to the take-home final exam.
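
    Item 18 above notes that edit distance has a simple solution via DP. For anyone who has not seen it, the following is a minimal sketch in Python, not taken from the course notes or videos. It assumes unit cost for each insertion, deletion, and substitution; the function name edit_distance is only illustrative.

```python
def edit_distance(a: str, b: str) -> int:
    """Unit-cost edit distance between a and b (illustrative sketch)."""
    # prev[j] holds the distance between the first i-1 characters of a
    # and the first j characters of b; row 0 is all insertions.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        # Column 0: delete all i characters of a.
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]
```

    The table is filled row by row and only the previous row is kept, so the sketch uses O(|a||b|) time and O(|b|) space. Recovering an actual alignment, rather than just its cost, would require keeping the full table or traceback pointers.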