Java Assignment Help: Phylogenetic Trees

Background: Phylogenetic Trees

A phylogenetic (or evolutionary) tree represents the evolution of species over time. Each node in the tree corresponds to a species, and a parent-child relationship represents an evolution from one species to another. Scientists build such trees from genetic and fossil data that indicate which species descended from which. In particular, given DNA samples from a variety of species and a way to determine which DNA sequence is derived from which, a computer can be used to build a phylogenetic tree.

A DNA sequence is composed of four different letters (bases): A, C, G, and T. E.g., AACT, ACGA, GCTAAACG, and TA are all DNA sequences. As species evolve, their DNA sequences change. On Earth, evolution can cause DNA to change in many different ways, making it challenging to determine which species descended from which. However, on Pluto, where temperatures are a little lower, DNA is only appended to. I.e., if species X evolves from species Y, then the DNA of species Y is a proper prefix of the DNA of species X.
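The prefix rule can be checked directly with String.startsWith. The following is a minimal sketch; the class name PlutoDna and the method name isProperPrefix are illustrative and not part of the assignment.

```java
// Hypothetical helper; the names here are illustrative, not from the assignment.
final class PlutoDna {
    // True when ancestor is a proper prefix of descendant: descendant
    // starts with ancestor and is strictly longer than it.
    static boolean isProperPrefix(String ancestor, String descendant) {
        return descendant.length() > ancestor.length()
                && descendant.startsWith(ancestor);
    }

    public static void main(String[] args) {
        System.out.println(isProperPrefix("AC", "ACTC")); // true
        System.out.println(isProperPrefix("AC", "AC"));   // false: equal, so not proper
        System.out.println(isProperPrefix("T", "ACTC"));  // false
        System.out.println(isProperPrefix("", "T"));      // true: "" is a prefix of every string
    }
}
```

The last case is why the empty string can serve as the root of every sequence.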

Problem: Generating a Phylogenetic Tree

Given a series of DNA sequences (from Pluto), construct a corresponding phylogenetic tree. The tree should then be displayed in a manner similar to Assignment 3, as described below. For example, given the input

ACTC AC ACC TAC ACCD TCA T

done

Figure 2: Sample DNA data from Pluto.

The resulting phylogenetic tree would look like:

*
|-AC
| |-ACC
| | |-ACCD
| |-ACTC
|-T
| |-TAC
| |-TCA

Figure 3: A representation of a phylogenetic tree generated from the data in Figure 2.

Your task is to create a program that generates a phylogenetic tree based on DNA samples and outputs a representation of the tree.

Write a program called TreeBuilder.java that reads in DNA sequences from the console (System.in) and outputs the corresponding phylogenetic tree. Your TreeBuilder class must contain the main() method where your program starts running.

Input

Your program should read in the input using a Scanner object, which is instantiated with System.in. The input will contain one or more lines. Each line, except the last one, will contain a single DNA sequence composed of the four letters A, C, G, and T. The last line will be the word done, indicating that no more input follows.
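The input format described above could be read along the following lines. This is a sketch only: the class name InputReader and the method name readSequences are illustrative, and stopping at end of input when the done line is missing is an added safety assumption, not something the assignment requires.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

// Hypothetical helper; the names here are illustrative, not from the assignment.
final class InputReader {
    // Collects whitespace-separated tokens until the sentinel "done".
    // Also stops at end of input (an assumption, in case "done" is missing).
    static List<String> readSequences(Scanner in) {
        List<String> sequences = new ArrayList<>();
        while (in.hasNext()) {
            String token = in.next();
            if (token.equals("done")) {
                break;
            }
            sequences.add(token);
        }
        return sequences;
    }

    public static void main(String[] args) {
        Scanner in = new Scanner(System.in);
        System.out.println(readSequences(in));
    }
}
```

Because next() splits on any whitespace, the same loop works whether the sequences arrive one per line or several on a line.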

Hint: All you need to use is the next() method of the Scanner object.

Semantics

The DNA sequences can be in any order and will all be unique. The root of the tree is the empty string (“”), which on Pluto is the root of all life. All data will generate a single tree. Each species (except the root) will be evolved from exactly one species, but multiple species may evolve from a single species, as in the example. (This is a simplification.)
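One way to read these rules, consistent with Figure 3, is that the parent of a species is the longest proper prefix of its DNA that also appears in the input, with the empty string as the fallback. The sketch below illustrates that interpretation; the class name ParentFinder and the method name parentOf are illustrative and not part of the assignment.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper; the names here are illustrative, not from the assignment.
final class ParentFinder {
    // Returns the longest proper prefix of dna that is in known,
    // or "" (the root) when no such sequence exists.
    static String parentOf(String dna, Set<String> known) {
        for (int len = dna.length() - 1; len > 0; len--) {
            String candidate = dna.substring(0, len);
            if (known.contains(candidate)) {
                return candidate;
            }
        }
        return "";
    }

    public static void main(String[] args) {
        // The sequences from Figure 2.
        Set<String> known = new HashSet<>(
                Arrays.asList("ACTC", "AC", "ACC", "TAC", "ACCD", "TCA", "T"));
        System.out.println(parentOf("ACCD", known));           // ACC
        System.out.println(parentOf("ACTC", known));           // AC
        System.out.println(parentOf("TAC", known));            // T
        System.out.println(parentOf("AC", known).isEmpty());   // true: AC hangs off the root
    }
}
```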

Output

Your program should output to System.out. The output should represent the generated phylogenetic tree. Each species should be on a separate line. All children of a node in the tree are to be displayed in lexically sorted order. Each species should be prefixed with d − 1 copies of “| ”, followed by “|-”, followed by the species name, where d is the depth of the node. (See Figure 3.) For the root node, output “*” instead of the empty string.
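The prefix rule for a single output line can be sketched as follows. The class name LineFormat and the method name formatLine are illustrative, and the sketch assumes the root has depth 0 so that its children have depth 1, which matches Figure 3.

```java
// Hypothetical helper; the names here are illustrative, not from the assignment.
final class LineFormat {
    // Builds one output line for a node at the given depth.
    // Assumes the root is at depth 0 and is printed as "*".
    static String formatLine(String name, int depth) {
        if (depth == 0) {
            return "*";
        }
        StringBuilder line = new StringBuilder();
        for (int i = 0; i < depth - 1; i++) {
            line.append("| "); // one "| " per ancestor level below the root
        }
        line.append("|-").append(name);
        return line.toString();
    }

    public static void main(String[] args) {
        System.out.println(formatLine("", 0));     // *
        System.out.println(formatLine("AC", 1));   // |-AC
        System.out.println(formatLine("ACC", 2));  // | |-ACC
        System.out.println(formatLine("ACCD", 3)); // | | |-ACCD
    }
}
```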

Example

Sample Input

aaaag aactaac aactaaccgaagc aactaaccgata aacttc aactaaca aacta aactaaccga end

Sample Output

*
|-aaaag
|-aacta
| |-aactaac
| | |-aactaaca
| | |-aactaaccga
| | | |-aactaaccgaagc
| | | |-aactaaccgata
|-aacttc

What to Hand In

At least one of the submitted files must be TreeBuilder.java, which is where the main program starts to run. If you have more than one Java file to submit, place them all in a zip file and submit that.

Hints and Suggestions

  • The sample solution uses a 2-pass algorithm. The first pass builds the phylogenetic tree. The second pass recursively outputs the tree. This pass is very similar to the one in the previous assignment.
  • There is not a lot of code to write. The sample solution is under 100 lines of code.
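A two-pass structure along these lines might look like the sketch below. It is not the sample solution: the build pass uses one possible technique (sort the sequences, then keep a stack holding the current chain of ancestors), and the names TwoPassSketch, Node, build, and render are all illustrative. It relies on the guarantee that the sequences are unique.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.List;

// Hypothetical sketch; the names here are illustrative, not from the assignment.
final class TwoPassSketch {
    static final class Node {
        final String dna;
        final List<Node> children = new ArrayList<>();
        Node(String dna) { this.dna = dna; }
    }

    // Pass 1: build the tree. In sorted order a prefix precedes its
    // extensions, so the stack always holds the ancestor chain of the
    // most recently added node.
    static Node build(List<String> sequences) {
        List<String> sorted = new ArrayList<>(sequences);
        Collections.sort(sorted);
        Node root = new Node("");
        Deque<Node> path = new ArrayDeque<>();
        path.push(root);
        for (String dna : sorted) {
            // Pop until the top is a prefix of dna; the root ("") always is.
            while (!dna.startsWith(path.peek().dna)) {
                path.pop();
            }
            Node node = new Node(dna);
            path.peek().children.add(node); // children end up in sorted order
            path.push(node);
        }
        return root;
    }

    // Pass 2: recursively render the tree, one node per line.
    static void render(Node node, int depth, StringBuilder out) {
        if (depth == 0) {
            out.append("*");
        } else {
            for (int i = 0; i < depth - 1; i++) {
                out.append("| ");
            }
            out.append("|-").append(node.dna);
        }
        out.append('\n');
        for (Node child : node.children) {
            render(child, depth + 1, out);
        }
    }

    public static void main(String[] args) {
        // The sequences from Figure 2; the output matches Figure 3.
        Node root = build(Arrays.asList("ACTC", "AC", "ACC", "TAC", "ACCD", "TCA", "T"));
        StringBuilder out = new StringBuilder();
        render(root, 0, out);
        System.out.print(out);
    }
}
```

Sorting first means no separate step is needed to order the children, since each child is attached in lexical order as it is encountered.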