Mini-Project Guidelines:
Everyone will present their mini-project results during our extended
final-exam period on 12/12 from noon-5pm. The order of presentations
will be random. You will be allotted 15 minutes to show your results and
explain novel aspects of your approach. You may use either a PowerPoint
presentation (no more than 10 slides) or a web page to demonstrate your
results. Your presentation will account for 50% of your mini-project
grade.
In addition, you are required to turn in a short writeup of your
project that concentrates on describing novel aspects of your approach,
and your results. The writeup should be approximately one page in length
and should not exceed two pages (PDF format is preferred). You should turn
in this writeup, all of your mini-project code, and a separate README file
explaining how to run your code, as a zip archive, which must be emailed
to mcmillan@cs.unc.edu before midnight on 12/12.
All projects must be written in Python, and should be individual
efforts. The last three projects are associated with potential RA opportunities,
if that is of interest.
Project #1: Eigenfaces
This project involves performing PCA on images of your classmates.
PCA is generally applied to a set of points with common features.
In order to align this data set's features you should first rectify
each image. This is best accomplished using an affine transform to
align the two eyes and mouth of each image, followed by cropping them
to a common image size. A good reconstruction filter will minimize the
quality loss at this stage.
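As a minimal sketch of the rectification step, an affine transform is fully determined by three point correspondences, so the two eyes and the mouth are enough. The function name and the landmark coordinates below are hypothetical, and NumPy is assumed to be available.

```python
import numpy as np

def affine_from_landmarks(src, dst):
    """Return the 2x3 affine matrix A mapping each src point onto its dst point.

    src, dst: three (x, y) landmarks each, e.g. left eye, right eye, mouth.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    # Homogeneous source coordinates: each row is (x, y, 1).
    src_h = np.hstack([src, np.ones((3, 1))])
    # Solve src_h @ A.T = dst for the six affine parameters.
    return np.linalg.solve(src_h, dst).T

# Hypothetical landmark positions in one photo and in the common template.
photo = [(120.0, 140.0), (200.0, 150.0), (165.0, 240.0)]
template = [(40.0, 50.0), (88.0, 50.0), (64.0, 100.0)]
A = affine_from_landmarks(photo, template)
mapped = A @ np.append(photo[0], 1.0)   # lands on template[0]
```

When resampling, it is usually easier to compute the transform in the opposite direction (template to photo) so that every output pixel can look up its source location and apply the reconstruction filter there.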
Once rectified, perform PCA on the resulting set of images treating
each as a single point. Find the number of factors necessary to capture
95% of the data set's total variance, and compute the scaling weights
for each input onto this set of factors (i.e., the projection of each
point onto the PCA factors). Compute the Mean Absolute Error (MAE) between
each point and its reconstruction, and keep track of which one (who) is the largest outlier.
Also, generate images of the mean point, and the first two PCA
factors.
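The PCA steps above might be organized roughly as follows. This is a sketch only: it assumes each rectified image has been flattened into one row of a NumPy array, it computes the factors with an SVD of the mean-centered data, and the function name `pca_summary` is made up for illustration.

```python
import numpy as np

def pca_summary(images, variance_target=0.95):
    """images: array of shape (n_samples, n_pixels), one flattened image per row."""
    X = np.asarray(images, dtype=float)
    mean = X.mean(axis=0)
    centered = X - mean
    # Rows of Vt are the principal directions (the "factors").
    U, S, Vt = np.linalg.svd(centered, full_matrices=False)
    explained = S**2 / np.sum(S**2)
    # Smallest k whose cumulative explained variance reaches the target.
    k = int(np.searchsorted(np.cumsum(explained), variance_target)) + 1
    k = min(k, len(S))
    factors = Vt[:k]
    weights = centered @ factors.T           # projection of each point
    recon = mean + weights @ factors         # reconstruction from k factors
    mae = np.abs(X - recon).mean(axis=1)     # one error value per image
    return mean, factors, weights, mae, int(np.argmax(mae))
```

The returned `mean` and the first two rows of `factors` can be reshaped back to the image dimensions to produce the requested pictures. The factors contain negative values, so they need to be rescaled for display.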
Hints and Options:
- Find headshot images of two celebrities of your own choosing.
Rectify them, and compute their projections onto the Eigenvectors of
the class factors you computed previously and their reconstruction
in 665 space (You need not include them in the PCA analysis).
Compute their MAE as well.
- Construct a plot of all samples projected onto the first
two Eigenvectors. Label each point, and identify the closest pair of
points with a line segment.
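For the plotting option, the closest pair can be found directly from the 2D projections. The brute-force sketch below assumes the projections are stored as an (n, 2) NumPy array; the function name is hypothetical.

```python
import numpy as np

def closest_pair(points):
    """Return (i, j, distance) for the two nearest rows of an (n, 2) array."""
    pts = np.asarray(points, dtype=float)
    diff = pts[:, None, :] - pts[None, :, :]
    dist = np.sqrt((diff**2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)           # ignore each point's distance to itself
    i, j = np.unravel_index(np.argmin(dist), dist.shape)
    return int(i), int(j), float(dist[i, j])
```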
Project #2: Tubule Filter
The following images are of testicular histology studies. The nearly
round substructures packed within the larger oval are cross-sections of
seminiferous tubules. The goal of this project is to design a filter, or combination
of filters, whose response is maximal when centered over a tubule. The tricky
part of this problem is that your filter should work for any image scale/resolution.
We are only interested in detecting tubules that are near circular in
their presentation (the oblong and twisted compartments can be safely ignored). Missing
a few tubules (false negatives) is preferred over incorrect detections
(false positives). In particular, you should strive not to mislabel any of
the gaps (white regions) between tubules.
Feel free to use any of the tools discussed in class; in particular, spatial
convolution, Fourier analysis, and wavelet analysis may prove useful.
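One possible starting point, offered only as an illustration and not as the expected solution, is a zero-mean center-surround kernel applied through the FFT. The kernel shape, the 1.5x surround radius, and the function names are all assumptions. The sketch also assumes the structure of interest is brighter than its surroundings, so you may need to invert the image first.

```python
import numpy as np

def ring_kernel(radius):
    """Zero-mean kernel: positive disk of the given radius, negative surround."""
    r_out = int(np.ceil(1.5 * radius))
    y, x = np.mgrid[-r_out:r_out + 1, -r_out:r_out + 1]
    dist = np.hypot(x, y)
    disk = dist <= radius
    ring = (dist > radius) & (dist <= r_out)
    kernel = np.zeros(dist.shape)
    kernel[disk] = 1.0 / disk.sum()          # disk weights sum to +1
    kernel[ring] = -1.0 / ring.sum()         # surround weights sum to -1
    return kernel

def filter_response(image, radius):
    """Circular convolution of the image with the ring kernel, via the FFT."""
    kernel = ring_kernel(radius)
    padded = np.zeros_like(image, dtype=float)
    kh, kw = kernel.shape
    padded[:kh, :kw] = kernel
    # Shift the kernel centre to the origin so the response is not offset.
    padded = np.roll(padded, (-(kh // 2), -(kw // 2)), axis=(0, 1))
    return np.real(np.fft.ifft2(np.fft.fft2(image) * np.fft.fft2(padded)))
```

Because both parts of the kernel are normalized, responses computed with different radii are on a comparable scale. Evaluating several radii and keeping the strongest response per pixel is one way to approach the scale requirement.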
Use the color channels to visualize your filter's response. For example,
you might display all regions exceeding a given response threshold as an intensity
in the red channel. In any case, you must provide some visualization of your
results overlaid on the given images.
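A thresholded response can be overlaid on the original image along the following lines. The sketch assumes an 8-bit grayscale image and a floating-point response array of the same shape; `overlay_red` is a hypothetical helper name.

```python
import numpy as np

def overlay_red(gray, response, threshold):
    """gray: 2D uint8 image; response: 2D float array of the same shape."""
    rgb = np.stack([gray, gray, gray], axis=-1).astype(float)
    mask = response > threshold
    # Map responses from (threshold, peak] onto red intensities in (0, 1].
    peak = response[mask].max() if mask.any() else threshold + 1.0
    strength = np.clip((response - threshold) / (peak - threshold), 0.0, 1.0)
    rgb[..., 0] = np.maximum(rgb[..., 0], 255.0 * strength)
    return rgb.astype(np.uint8)
```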
Hints and Options:
- You can preprocess the original image in various ways before
applying your filter. For example, you might adjust or normalize
the intensity levels.
- Use different colors to indicate the confidence of your detection.
- Automatically fill in each tubule, indicating it with a color tint.
- Count the number of tubules detected, and the ratio of their area to the
area of the entire oval.
Project #3: Motility Movie Analysis
This project analyzes the motility of spermatozoa using 90 movie frames
captured from a light microscope. The first objective is to separate the moving
objects from the fixed background clutter. This requires extracting a background
matte image, and then comparing each movie frame against it. Either a temporal
average or temporal median image can be used.
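A minimal sketch of the background extraction, assuming the frames have been loaded into a single grayscale NumPy array. The threshold value and the function name are placeholders that you would need to tune.

```python
import numpy as np

def background_and_masks(frames, threshold=25.0):
    """frames: array of shape (n_frames, height, width), grayscale."""
    stack = np.asarray(frames, dtype=float)
    background = np.median(stack, axis=0)    # temporal median matte
    # A pixel counts as "moving" in a frame when it differs enough from the matte.
    masks = np.abs(stack - background) > threshold
    return background, masks
```

Replacing `np.median` with `np.mean` gives the temporal-average variant. The median tends to be less affected by objects that pass through a pixel for only a few frames.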
The next requirement is to construct a summary image, which integrates
the moving elements along their motion paths, with the previously computed
background removed. The motion path should be pseudo-colored by tinting
the elements of the first frame blue with a smooth transition to shades
of red in the last frame.
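The pseudo-colored summary image could be assembled as sketched below. It assumes a boolean mask of moving pixels per frame and uses a simple linear blue-to-red ramp; where paths overlap, later frames overwrite earlier ones, which is just one of several reasonable choices.

```python
import numpy as np

def motion_summary(masks):
    """masks: boolean array (n_frames, height, width) of moving pixels."""
    n = masks.shape[0]
    summary = np.zeros(masks.shape[1:] + (3,), dtype=float)
    for t in range(n):
        u = t / (n - 1) if n > 1 else 0.0     # 0 at the first frame, 1 at the last
        color = np.array([u, 0.0, 1.0 - u])    # RGB ramp from blue to red
        summary[masks[t]] = color              # later frames overwrite earlier ones
    return (255 * summary).astype(np.uint8)
```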
Hints and Options:
- Extend your code to individually track each moving item separately.
You should report the trajectory as a 4-tuple of (item_number, frame, x, y),
and you can visualize your result by assigning a different color to
each item.
- Reintroduce the background into your animation by tinting it green
and estimating a per-pixel alpha/opacity value for the animation (Note
that a median background matte might provide better results for this
option). Then overlay the animation over the background. Experiment with
other backgrounds.
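For the compositing option, one simple (and by no means the only) way to estimate opacity is to scale each pixel's absolute difference from the matte. The `softness` parameter and the function name below are assumptions made for illustration.

```python
import numpy as np

def composite_over_green(animation_rgb, frame, background, softness=40.0):
    """animation_rgb: (h, w, 3) floats in [0, 1]; frame, background: (h, w) in 0..255."""
    frame = np.asarray(frame, dtype=float)
    background = np.asarray(background, dtype=float)
    # Hypothetical alpha estimate: larger difference from the matte -> more opaque.
    alpha = np.clip(np.abs(frame - background) / softness, 0.0, 1.0)[..., None]
    green = np.zeros(background.shape + (3,))
    green[..., 1] = background / 255.0        # background tinted green
    return alpha * animation_rgb + (1.0 - alpha) * green
```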
Project #4: Visualizing Sequence Similarity
The genome (DNA) sequences of individuals from the same species are largely
identical, with the exception of a few polymorphic positions called SNPs
(for Single Nucleotide Polymorphisms). The subset of variable positions
in a genomic sequence is called the organism's haplotype.
A useful pairwise measure of local sequence dissimilarity is the number
of differing positions within a specified genomic window. Haplotype sequences
are ideal for making such sequence comparisons.
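As a small illustration of this measure, the sketch below counts mismatches between two haplotype strings inside a half-open genomic window. The function name is hypothetical; the demo values are the first five SNP positions and the first five alleles of 129S1/SvImJ and CAST/EiJ from the sample table in this section.

```python
from bisect import bisect_left

def window_mismatches(positions, seq_a, seq_b, start, end):
    """Count SNPs in [start, end) at which two haplotype strings differ.

    positions must be sorted ascending and parallel to seq_a and seq_b.
    """
    lo = bisect_left(positions, start)
    hi = bisect_left(positions, end)
    return sum(a != b for a, b in zip(seq_a[lo:hi], seq_b[lo:hi]))

positions = [7826250, 7826345, 7826455, 7826615, 7827097]
s129 = "TCAGG"   # 129S1/SvImJ
cast = "ATGAT"   # CAST/EiJ
count = window_mismatches(positions, s129, cast, 7826000, 7827000)
```

Four of the five positions fall inside this window, and the two strains differ at each of them.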
The object of this project is to create an animation to visualize the
sequence similarities of several common lab-mouse strains. Each frame of
the visualization should use MDS to find the 2D coordinates of each strain
which best represents pairwise sequence differences as the Euclidean distance
between the pair. The animation comprises the frames generated as a
10-megabase window slides along the sequence, advancing 1 megabase per frame.
For each frame you will need to first find the set of SNPs that fall within
the current interval. For this set of SNPs you will need to compare all pairs
of strains, and find the number of sequence mismatches between strains.
These differences will be the entries of your dissimilarity matrix to which
MDS will be applied. Once applied, the initial 2D coordinates of the points
will be the corresponding values from the first two Eigenvectors.
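A sketch of classical (Torgerson) MDS is shown below, on the assumption that this is the variant intended here. It double-centers the squared dissimilarities, takes the two largest eigenpairs, and scales each eigenvector by the square root of its eigenvalue. The function name is hypothetical.

```python
import numpy as np

def classical_mds(dissimilarity, dims=2):
    """Embed points so Euclidean distances approximate the given dissimilarities."""
    D = np.asarray(dissimilarity, dtype=float)
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n        # centering matrix
    B = -0.5 * J @ (D**2) @ J                  # double-centered Gram matrix
    eigvals, eigvecs = np.linalg.eigh(B)       # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:dims]   # indices of the largest ones
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scale
```

Mismatch counts are not guaranteed to be exactly Euclidean, so some eigenvalues may be negative. The clip above simply discards that part of the signal.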
| Strain \ Position | 7826250 | 7826345 | 7826455 | 7826615 | 7827097 | 7827157 | 7827294 | 7827565 | 7829663 | 7829789 | 7829803 | 7830169 | 7830174 | 7830255 | 7830319 | 7830335 | 7830447 | 7831894 | 7831910 | 7831973 | 7832179 | 7832210 | 7832267 | 7832297 | 7832310 | 7832341 | 7832387 | 7832409 | 7832456 | 7832483 | 7832504 | 7832524 | 7832591 | 7832654 | 7832682 | 7832717 | 7832733 | 7832757 | 7832882 | 7832942 | 7833033 | 7833093 | 7833130 | 7833179 | 7839803 | 7839834 | 7839891 | 7839937 | 7839943 | 7839971 |
| 129S1/SvImJ | T | C | A | G | G | C | C | A | C | A | A | A | G | G | C | T | C | A | T | G | G | G | A | A | T | G | T | C | A | T | T | C | C | G | A | G | G | G | G | A | T | T | T | A | G | A | G | T | G | T |
| 129S4/SvJae | T | C | A | G | G | C | C | A | C | A | A | A | G | G | C | T | C | A | T | G | G | G | A | A | T | G | T | C | A | T | T | C | C | G | G | G | G | G | G | A | T | T | T | A | G | A | G | T | G | T |
| 129X1/SvJ | T | C | A | G | G | C | C | A | C | A | A | A | G | G | C | T | C | A | T | G | G | G | A | A | T | G | T | C | A | T | T | C | C | G | A | G | G | G | G | A | T | T | T | A | G | A | G | T | G | T |
| A/J | T | C | A | G | G | C | C | A | C | A | A | A | G | G | C | T | C | A | T | G | G | G | A | A | T | G | T | C | A | T | T | C | C | G | G | G | G | G | G | A | T | T | T | A | G | A | T | T | G | T |
| AKR/J | T | C | A | G | G | C | C | A | C | A | A | A | G | G | C | T | C | A | T | G | G | G | A | A | T | G | T | C | A | T | T | C | C | G | A | G | G | G | G | A | T | T | T | A | G | A | G | T | G | T |
| BALB/cByJ | T | C | A | G | G | C | C | A | C | A | A | A | G | G | C | T | C | A | T | G | G | G | A | A | T | G | T | C | A | T | T | C | C | G | G | G | G | G | G | A | T | T | T | A | G | A | T | T | G | T |
| BALB/cJ | T | C | A | G | G | C | C | A | C | A | A | A | G | G | C | T | C | A | T | G | G | G | A | A | T | G | T | C | A | T | T | C | C | G | G | G | G | G | G | A | T | T | T | A | G | A | G | T | G | T |
| BPH/2J | T | C | A | G | G | C | C | A | C | A | A | A | G | G | C | T | C | A | T | G | G | G | A | A | T | G | T | C | A | T | T | C | C | G | G | G | G | G | G | A | T | T | T | A | G | A | G | T | G | T |
| BPL/1J | T | C | A | G | G | C | C | A | C | A | A | A | G | G | C | T | C | A | T | G | G | G | A | A | T | G | T | C | A | T | T | C | C | G | G | G | G | G | G | A | T | T | T | A | G | A | G | T | G | T |
| BPN/3J | T | C | A | G | G | C | C | A | C | A | A | A | G | G | C | T | C | A | T | G | G | G | A | A | T | G | T | C | A | T | T | C | C | G | G | G | G | G | G | A | T | T | T | A | G | A | G | T | G | T |
| BTBR T+ tf/J | T | C | A | G | G | C | C | A | C | A | A | A | G | G | C | T | C | A | T | G | G | G | A | A | T | G | T | C | A | T | T | C | C | G | A | G | G | G | G | A | T | T | T | A | G | A | G | T | G | T |
| BUB/BnJ | T | C | A | G | G | C | C | A | C | A | A | A | G | G | C | T | C | A | T | G | G | G | A | A | T | G | T | C | A | T | T | C | C | G | G | G | G | G | G | A | T | T | T | A | G | A | G | T | G | T |
| C3H/HeJ | T | C | A | G | G | C | C | A | C | A | A | A | G | G | C | T | C | A | T | G | G | G | A | A | T | G | T | C | A | T | T | C | C | G | G | G | G | G | G | A | T | T | T | A | G | A | G | T | G | T |
| C57BL/10J | T | C | A | G | G | T | C | A | T | A | A | A | A | G | C | T | C | A | T | G | G | A | A | G | G | T | A | T | G | C | C | T | T | A | G | A | T | A | A | A | T | T | G | G | A | A | G | C | G | G |
| C57BL/6J | T | C | A | G | G | T | C | A | T | A | A | A | G | G | C | T | C | A | T | G | G | G | A | G | T | G | T | C | A | T | T | C | C | A | G | G | G | G | G | A | T | A | T | A | A | A | G | T | G | G |
| C57BLKS/J | T | C | A | G | G | C | C | A | C | A | A | A | G | G | C | T | C | A | T | G | G | G | A | A | T | G | T | C | A | T | T | C | C | G | G | G | G | G | G | A | T | T | T | A | G | A | G | T | G | T |
| C57BR/cdJ | T | C | A | G | G | C | C | A | C | A | A | A | G | G | C | T | C | A | T | G | G | G | A | A | T | G | T | C | A | T | T | C | C | G | G | G | G | G | G | A | T | T | T | A | G | A | G | T | G | T |
| C57L/J | T | C | A | G | G | C | C | A | C | A | A | A | G | G | C | T | C | A | T | G | G | G | A | A | T | G | T | C | A | T | T | C | C | G | G | G | G | G | G | A | T | T | T | A | G | A | G | T | G | T |
| C58/J | T | C | A | G | G | T | C | A | T | A | A | A | A | G | C | T | C | A | T | G | G | A | A | G | G | T | A | T | G | C | C | T | T | A | G | A | T | A | A | A | T | T | G | G | A | A | G | C | G | G |
| CALB/RK | A | T | G | A | T | C | A | G | T | T | G | T | G | A | T | G | T | G | A | A | T | A | A | G | G | T | A | T | G | C | C | T | T | A | G | A | T | A | A | A | T | T | G | G | A | A | G | C | G | G |
| CAST/EiJ | A | T | G | A | T | C | A | G | C | T | G | T | G | A | T | T | T | G | A | A | T | A | A | G | G | T | A | T | A | C | C | T | C | A | G | A | T | A | A | A | T | T | G | G | A | A | G | C | G | G |
| CBA/J | T | C | A | G | G | C | C | A | C | A | A | A | G | G | C | T | C | A | T | G | G | G | G | A | T | G | T | C | A | T | T | C | C | G | G | G | G | G | G | A | C | T | T | A | A | C | G | T | A | T |
| CE/J | T | C | A | G | G | C | C | A | C | A | A | A | G | G | C | T | C | A | T | G | G | G | A | A | T | G | T | C | A | T | T | C | C | G | G | G | G | G | G | A | T | T | T | A | G | A | G | T | G | T |
| CZECHII/EiJ | T | C | A | G | G | C | C | G | C | A | A | A | A | G | C | T | C | A | T | G | G | G | A | G | G | T | A | T | A | C | C | T | C | G | A | A | T | A | A | C | T | T | G | G | A | A | G | C | G | G |
| DBA/1J | T | C | A | G | G | C | C | A | C | A | A | A | G | G | C | T | C | A | T | G | G | G | G | A | T | G | T | C | A | T | T | C | C | G | G | G | G | G | G | A | C | T | T | A | A | C | G | T | A | T |
| DBA/2J | T | C | A | G | G | C | C | A | C | A | A | A | G | G | C | T | C | A | T | G | G | G | G | A | T | G | T | C | A | T | T | C | C | G | G | G | G | G | G | A | C | T | T | A | A | C | G | T | A | T |
| DDK/Pas | T | C | A | G | G | C | C | A | C | A | A | A | G | G | C | T | C | A | T | G | G | G | A | A | T | G | T | C | A | T | T | C | C | G | G | G | G | G | G | A | T | T | T | A | G | A | G | T | G | T |
| DDY/J | T | C | A | G | G | C | C | A | C | A | A | A | G | G | C | T | C | A | T | G | G | G | A | A | T | G | T | C | A | T | T | C | C | G | G | G | G | G | G | A | T | T | T | A | G | A | G | T | G | T |
| EL/SUZ_2 | T | C | A | G | G | C | C | A | C | A | A | A | G | G | C | T | C | A | T | G | G | G | A | A | T | G | T | C | A | T | T | C | C | G | G | G | G | G | G | A | T | T | T | A | G | A | G | T | G | T |
| FVB/NJ | T | C | A | G | G | C | C | A | C | A | A | A | G | G | C | T | C | A | T | G | G | G | A | A | T | G | T | C | A | T | T | C | C | G | G | G | G | G | G | A | T | T | T | A | G | A | T | T | G | T |
| HTG/GOSFSN | T | C | A | G | G | C | C | A | C | A | A | A | G | G | C | T | C | A | T | G | G | G | A | A | T | G | T | C | A | T | T | C | C | G | G | G | G | G | G | A | T | T | T | A | G | A | G | T | G | T |
| I/LnJ | T | C | A | G | G | C | C | A | C | A | A | A | G | G | C | T | C | A | T | G | G | G | A | A | T | G | T | C | A | T | T | C | C | G | G | G | G | G | G | A | T | T | T | A | G | A | G | T | G | T |
| IL/ILS | T | C | A | G | G | C | C | A | C | A | A | A | G | G | C | T | C | A | T | G | G | G | A | A | T | G | T | C | A | T | T | C | C | G | G | G | G | G | G | A | T | T | T | A | G | A | G | T | G | T |
| IS/CAMRK | A | T | G | A | T | C | A | G | T | T | G | T | G | A | T | G | T | G | A | A | T | A | A | G | G | T | A | T | G | C | C | T | T | A | G | A | T | A | A | A | T | T | G | G | A | A | G | C | G | G |
| IS/ISS | T | C | A | G | G | C | C | A | C | A | A | A | G | G | C | T | C | A | T | G | G | G | A | A | T | G | T | C | A | T | T | C | C | G | G | G | G | G | G | A | T | T | T | A | G | A | G | T | G | T |
| JF1/Ms | T | C | A | G | G | C | C | G | C | A | A | A | A | G | C | T | C | A | T | G | G | G | A | G | G | T | A | T | A | C | C | T | C | G | A | A | T | A | A | C | T | T | G | G | A | A | G | C | G | G |
| KK/HLJ | T | C | A | G | G | C | C | A | C | A | A | A | G | G | C | T | C | A | T | G | G | G | A | A | T | G | T | C | A | T | T | C | C | G | G | G | G | G | G | A | T | T | T | A | G | A | T | T | G | T |
| LEWES/EI | A | T | G | A | T | C | A | G | T | T | G | T | G | A | T | G | T | G | A | A | T | A | A | G | G | T | A | T | G | C | C | T | T | A | G | A | T | A | A | A | T | T | G | G | A | A | G | C | G | G |
| LG/J | T | C | A | G | G | C | C | A | C | A | A | A | G | G | C | T | C | A | T | G | G | G | G | A | T | G | T | C | A | T | T | C | C | G | G | G | G | G | G | A | C | T | T | A | A | C | G | T | A | T |
| LP/J | T | C | A | G | G | C | C | A | C | A | A | A | G | G | C | T | C | A | T | G | G | G | A | A | T | G | T | C | A | T | T | C | C | G | G | G | G | G | G | A | T | T | T | A | G | A | G | T | G | T |
| MA/MyJ | T | C | A | G | G | C | C | A | C | A | A | A | G | G | C | T | C | A | T | G | G | G | A | A | T | G | T | C | A | T | T | C | C | G | G | G | G | G | G | A | T | T | T | A | G | A | G | T | G | T |
| MAI/Pas | T | C | A | G | G | C | C | G | C | A | A | A | A | G | C | T | C | A | T | G | G | G | A | G | G | T | A | T | A | C | C | T | C | G | A | A | T | A | A | C | T | T | G | G | A | A | G | C | G | G |
| MOLF/EiJ | T | C | A | G | G | C | C | A | C | A | A | A | G | G | C | T | C | A | T | G | G | G | A | A | T | G | T | C | A | T | T | C | C | G | G | G | G | G | G | A | T | T | T | A | A | A | T | T | G | T |
| MOLG/DN | T | C | A | G | G | C | C | G | C | A | A | A | A | G | C | T | C | A | T | G | G | G | A | G | G | T | A | T | A | C | C | T | C | G | A | A | T | A | A | C | T | T | G | G | A | A | G | C | G | G |
| MRL/MpJ | T | C | A | G | G | C | C | A | C | A | A | A | G | G | C | T | C | A | T | G | G | G | G | A | T | G | T | C | A | T | T | C | C | G | G | G | G | G | G | A | C | T | T | A | A | C | G | T | A | T |
| MSM/Ms | T | C | A | G | G | C | C | G | C | A | A | A | A | G | C | T | C | A | T | G | G | G | A | G | G | T | A | T | A | C | C | T | C | G | A | A | T | A | A | C | T | T | G | G | A | A | G | C | G | G |
| NOD/LtJ | T | C | A | G | G | C | C | A | C | A | A | A | G | G | C | T | C | A | T | G | G | G | A | A | T | G | T | C | A | T | T | C | C | G | G | G | G | G | G | A | T | T | T | A | G | A | G | T | G | T |
| NON/LtJ | T | C | A | G | G | C | C | A | C | A | A | A | G | G | C | T | C | A | T | G | G | G | A | A | T | G | T | C | A | T | T | C | C | G | G | G | G | G | G | A | T | T | T | A | G | A | G | T | G | T |
| NOR/LTJ | T | C | A | G | G | C | C | A | C | A | A | A | G | G | C | T | C | A | T | G | G | G | A | A | T | G | T | C | A | T | T | C | C | G | G | G | G | G | G | A | T | T | T | A | G | A | G | T | G | T |
| NZB/BlNJ | T | C | A | G | G | C | C | A | C | A | A | A | G | G | C | T | C | A | T | G | G | G | A | A | T | G | T | C | A | T | T | C | C | G | G | G | G | G | G | A | T | T | T | A | G | A | G | T | G | T |
| NZO/HlLtJ | T | C | A | G | G | C | C | A | C | A | A | A | G | G | C | T | C | A | T | G | G | G | A | A | T | G | T | C | A | T | T | C | C | G | G | G | G | G | G | A | T | T | T | A | G | A | G | T | G | T |
| NZW/LacJ | T | C | A | G | G | C | C | A | C | A | A | A | G | G | C | T | C | A | T | G | G | G | A | A | T | G | T | C | A | T | T | C | C | G | G | G | G | G | G | A | T | T | T | A | G | A | G | T | G | T |
| O2/O20 | T | C | A | G | G | C | C | A | C | A | A | A | G | G | C | T | C | A | T | G | G | G | A | A | T | G | T | C | A | T | T | C | C | G | G | G | G | G | G | A | T | T | T | A | G | A | G | T | G | T |
| P/J | T | C | A | G | G | C | C | A | C | A | A | A | G | G | C | T | C | A | T | G | G | G | A | A | T | G | T | C | A | T | T | C | C | G | G | G | G | G | G | A | T | T | T | A | G | A | G | T | G | T |
| PERA/EiJ | A | T | G | A | T | C | A | G | T | T | G | T | G | A | T | G | T | G | A | A | T | A | A | G | G | T | A | T | G | C | C | T | T | A | G | A | T | A | A | A | T | T | G | G | A | A | G | C | G | G |
| PERC/EI | A | T | G | A | T | C | A | G | T | T | G | T | G | A | T | G | T | G | A | A | T | A | A | G | G | T | A | T | G | C | C | T | T | A | G | A | T | A | A | A | T | T | G | G | A | A | G | C | G | G |
| PL/J | T | C | A | G | G | C | C | A | C | A | A | A | G | G | C | T | C | A | T | G | G | G | A | A | T | G | T | C | A | T | T | C | C | G | G | G | G | G | G | A | T | T | T | A | G | A | G | T | G | T |
| PWD/Ph | A | T | G | A | T | C | A | G | T | T | G | T | G | A | T | G | T | G | A | A | T | A | A | G | G | T | A | T | G | C | C | T | T | A | G | A | T | A | A | A | T | T | G | G | A | A | G | C | G | G |
| PWK/PhJ | T | C | A | G | G | C | C | G | C | A | A | A | A | G | C | T | C | A | T | G | G | G | A | G | G | T | A | T | A | C | C | T | C | G | A | A | T | A | A | C | T | T | G | G | A | A | G | C | G | G |
| Qsi/Qsi5 | T | C | A | G | G | C | C | A | C | A | A | A | G | G | C | T | C | A | T | G | G | G | A | A | T | G | T | C | A | T | T | C | C | G | G | G | G | G | G | A | T | T | T | A | G | A | G | T | G | T |
| RF/J | T | C | A | G | G | C | C | A | C | A | A | A | G | G | C | T | C | A | T | G | G | G | G | A | T | G | T | C | A | T | T | C | C | G | G | G | G | G | G | A | C | T | T | A | A | C | G | T | A | T |
| RIIIS/J | T | C | A | G | G | C | C | A | C | A | A | A | G | G | C | T | C | A | T | G | G | G | A | A | T | G | T | C | A | T | T | C | C | G | G | G | G | G | G | A | T | T | T | A | G | A | G | T | G | T |
| SEA/GnJ | T | C | A | G | G | C | C | A | C | A | A | A | G | G | C | T | C | A | T | G | G | G | A | A | T | G | T | C | A | T | T | C | C | G | G | G | G | G | G | A | T | T | T | A | G | A | G | T | G | T |
| SEG/Pas | T | C | A | G | G | C | C | G | C | A | A | A | A | G | C | T | C | A | T | G | G | G | A | G | G | T | A | T | A | C | C | T | C | A | A | A | T | A | A | C | T | T | G | G | A | A | G | C | G | G |
| SJL/J | T | C | A | G | G | C | C | A | C | A | A | A | G | G | C | T | C | A | T | G | G | G | A | A | T | G | T | C | A | T | T | C | C | G | G | G | G | G | G | A | T | T | T | A | G | A | G | T | G | T |
| SKIVE/EI | T | C | A | G | G | C | C | G | C | A | A | A | A | G | C | T | C | A | T | G | G | G | A | G | G | T | A | T | A | C | C | T | C | A | A | A | T | A | A | C | T | T | G | G | A | A | G | C | G | G |
| SM/J | T | C | A | G | G | C | C | A | C | A | A | A | G | G | C | T | C | A | T | G | G | G | A | A | T | G | T | C | A | T | T | C | C | G | G | G | G | G | G | A | T | T | T | A | G | A | G | T | G | T |
| SOD1/EI | T | C | A | G | G | T | C | A | T | A | A | A | A | G | C | T | C | A | T | G | G | A | A | G | G | T | A | T | G | C | C | T | T | A | G | A | T | A | A | A | T | T | G | G | A | A | G | C | G | G |
| SPRET/EiJ | T | C | A | G | G | C | C | G | C | A | A | A | A | G | C | T | C | A | T | G | G | G | A | G | G | T | A | T | A | C | C | T | C | G | A | A | T | A | A | C | T | T | G | G | A | A | G | C | G | G |
| ST/bJ | T | C | A | G | G | C | C | A | C | A | A | A | G | G | C | T | C | A | T | G | G | G | A | A | T | G | T | C | A | T | T | C | C | G | G | G | G | G | G | A | T | T | T | A | G | A | G | T | G | T |
| SWR/J | T | C | A | G | G | C | C | A | C | A | A | A | G | G | C | T | C | A | T | G | G | G | A | A | T | G | T | C | A | T | T | C | C | G | G | G | G | G | G | A | T | T | T | A | G | A | G | T | G | T |
| TALLYHO/JNGJ | T | C | A | G | G | C | C | A | C | A | A | A | G | G | C | T | C | A | T | G | G | G | A | A | T | G | T | C | A | T | T | C | C | G | G | G | G | G | G | A | T | T | T | A | G | A | G | T | G | T |
| WSB/EiJ | T | C | A | G | G | T | C | A | T | A | A | A | A | G | C | T | C | A | T | G | G | G | A | G | T | G | T | C | G | T | T | C | C | A | G | G | G | G | G | C | T | A | T | A | A | A | G | T | G | G |
| ZALENDE/EiJ | A | T | G | A | T | C | A | G | T | T | G | T | G | A | T | G | T | G | A | A | T | A | A | G | G | T | A | T | G | C | C | T | T | A | G | A | T | A | A | A | T | T | G | G | A | A | G | C | G | G |
Shown above is a sample of a haplotype sequence. You can download a full sequence here.
It is a zipped text file (.csv format, which can be loaded into Excel). It is also transposed relative to the table shown above.
In order to maintain the coherence of your animation, it will be necessary
to consistently align each frame of the animation. Use the following
alignment approach: translate the centroid of the strains named "CAST/EiJ",
"WSB/EiJ" and "PWD/Ph" to the center of your image. Rotate the result so
that CAST/EiJ is offset entirely in the positive x-direction. You
may then need to reflect all of the coordinates
so that the y-coordinate of PWD/Ph is positive (this involves only multiplying
the y-values of all strains by -1). Finally, you should scale the entire image
so that all strains, and their labels, fit within the image.
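The translate, rotate, and reflect steps can be sketched as follows, assuming the MDS output is an (n, 2) NumPy array with a parallel list of strain names. The function name is hypothetical, and the final scaling to image coordinates is left out.

```python
import numpy as np

def align_frame(coords, names, anchors=("CAST/EiJ", "WSB/EiJ", "PWD/Ph")):
    """coords: (n, 2) MDS output; names: list of strain names in row order."""
    idx = [names.index(a) for a in anchors]
    pts = np.asarray(coords, dtype=float)
    pts = pts - pts[idx].mean(axis=0)          # anchor centroid to the origin
    cx, cy = pts[idx[0]]
    angle = -np.arctan2(cy, cx)                # rotate CAST/EiJ onto the +x axis
    c, s = np.cos(angle), np.sin(angle)
    pts = pts @ np.array([[c, -s], [s, c]]).T
    if pts[idx[2], 1] < 0:                     # reflect so PWD/Ph has y >= 0
        pts[:, 1] *= -1
    return pts
```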
Hints and Options:
- You can connect your strain points into a tree that is constructed as follows.
For each dissimilarity matrix, find the smallest off-diagonal value, which will
correspond to the closest pair of strains. Connect these two strains with a line
segment in your visualization. Next remove these two strains from your table, and
replace them with a new "virtual" strain, whose distance to each other strain is the
average of the two replaced strains' distances to it. The coordinate of this virtual strain should
be the midpoint of the line segment (do not run MDS again). Repeat this
procedure until only two strains are left, and then connect them with the final
segment.
- Use the LBG algorithm to identify 3-4 clusters (codebook entries), and assign
a distinct color to the points in each cluster. In all likelihood you will not end
up with consistent colors if you apply this to every frame; thus, you should
experiment with changing it over 10 frames or so. You should try as much as possible
to keep your colors consistent between frames; seeding the LBG centroid points with
the means of the clusters from the previous LBG run will aid in this.
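The tree-construction hint above might be implemented roughly as follows. This is a sketch with hypothetical names: dissimilarities are stored in a dictionary keyed by `frozenset` pairs, and the loop also handles the final two strains, so it yields one segment fewer than the number of strains.

```python
def build_tree(names, dist, coords):
    """Return line segments ((x1, y1), (x2, y2)) joining strains into a tree.

    dist: dict mapping frozenset({a, b}) -> dissimilarity; coords: name -> (x, y).
    """
    dist, coords, active = dict(dist), dict(coords), list(names)
    segments = []
    while len(active) > 1:
        # Closest remaining pair (smallest off-diagonal entry).
        a, b = min(
            ((p, q) for i, p in enumerate(active) for q in active[i + 1:]),
            key=lambda pair: dist[frozenset(pair)],
        )
        segments.append((coords[a], coords[b]))
        virtual = f"({a}+{b})"
        # The virtual strain sits at the midpoint of the new segment.
        coords[virtual] = tuple((u + v) / 2 for u, v in zip(coords[a], coords[b]))
        active = [n for n in active if n not in (a, b)]
        for other in active:
            dist[frozenset((virtual, other))] = (
                dist[frozenset((a, other))] + dist[frozenset((b, other))]
            ) / 2
        active.append(virtual)
    return segments
```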