DNA analysis & Java




happyElephant
Jan 8, 2007, 07:28 PM
Hello,
Sorry for this easy question (I couldn't solve the problem using Java beginner's guide). I want my program to divide whatever number I input by three. However, it keeps returning false answers. Unfortunately, I can't figure out why. Firstly, this is the code:

class AnalyseADN{
public static void main(String args[])
throws java.io.IOException{
int n;

do{
n=(int)System.in.read();
}while(n=='\n'|n=='\r');

System.out.println("Le nombre total de triplets est "+n/3+".");

}
}

Secondly, I would appreciate a hint on how to write a program that displays the number of different DNA triplets in the entire sequence (which goes like this, for example: LYS VAL PHE GLU ARG CYS GLU LEU VAL... etc.). It should display 7 for the given sequence. Should I start with arrays, variable casts, or some kind of variable sorting?
Thanks in advance



mbabauer
Jan 8, 2007, 08:08 PM
You are not getting "false" answers, you are just not getting the answer you think you should. The reason has to do with the fact you are using integers, or int, to store the number.

An integer is a whole number, like 1, 221, 5764, etc. This is in contrast to a float, which is a decimal like 1.0, 2.333333, 27.678, etc.

When you do math, the computer "casts" the values to the type you told it to. So, for results that are decimal in nature but stored in an int type, the computer will truncate the number prior to storing. Notice I didn't say round the number, but rather truncate. The difference is that rounding goes up to the next whole number when the decimal part is .5 or more and drops down when it is less than .5, whereas truncation just whacks the decimal part off totally.

For instance, the numbers 3.547, 3.1413, 12.78, and 55.1234 would round to 4, 3, 13, and 55, but would be truncated to 3, 3, 12, and 55.
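The difference between truncating and rounding can be checked directly in Java. The following is only a minimal sketch; the class and method names (TruncateVsRound, truncate, round) are made up for illustration, and it uses the four sample numbers from the post.

```java
// Sketch: truncation (cast to int) versus rounding (Math.round).
class TruncateVsRound {
    // Casting a double to int drops the fractional part (truncates toward zero).
    static int truncate(double value) {
        return (int) value;
    }

    // Math.round returns the closest whole number as a long; ties round up.
    static long round(double value) {
        return Math.round(value);
    }

    public static void main(String[] args) {
        double[] samples = {3.547, 3.1413, 12.78, 55.1234};
        for (double d : samples) {
            System.out.println(d + " -> truncated " + truncate(d) + ", rounded " + round(d));
        }
        // Dividing one int by another is integer division: the fraction is discarded.
        System.out.println("10 / 3 = " + (10 / 3));
    }
}
```

Note that when both operands are already int, as in n/3 in the original program, Java performs integer division directly, so no fractional value is ever produced.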

Now, a warning. By using floats you may STILL get answers that are off. This has less to do with truncation and more to do with how a decimal is stored.

Internally, all data in a computer is stored as binary, which is 0's and 1's. There are no decimals, no plus or minus signs, just binary 0's and 1's. A typical int or float value is usually stored internally as a 32-bit number. That means there are 32 slots to put data in.

Without signs and decimals, how is one to represent numbers using just bits? For integers, since they don't have decimals, all that is needed is a means to store the sign. This is done using 2's Complement (http://en.wikipedia.org/wiki/Two's_complement). Decimals, on the other hand, are stored as significant digits together with an exponent. I couldn't find a Wikipedia article describing it, but suffice it to say the scheme allows for decimal numbers, but with loss of precision.
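To make the bit patterns visible, here is a small sketch that prints the 32 bits of an int and splits a Java float into its sign, exponent and significand fields. It assumes the standard layout Java uses for float (1 sign bit, 8 exponent bits, 23 significand bits); the class and helper names (BitLayout, intBits, and so on) are invented for this example.

```java
// Sketch: looking at the raw 32 bits behind an int and a float.
class BitLayout {
    // 32-character binary string of an int (two's complement for negatives).
    static String intBits(int value) {
        // toBinaryString drops leading zeros, so pad back to 32 characters.
        return String.format("%32s", Integer.toBinaryString(value)).replace(' ', '0');
    }

    // Top bit of a float: 0 for positive, 1 for negative.
    static int sign(float f) {
        return Float.floatToIntBits(f) >>> 31;
    }

    // Next 8 bits: the exponent, stored with an offset (bias) of 127.
    static int biasedExponent(float f) {
        return (Float.floatToIntBits(f) >>> 23) & 0xFF;
    }

    // Low 23 bits: the stored part of the significant digits.
    static int significandBits(float f) {
        return Float.floatToIntBits(f) & 0x7FFFFF;
    }

    public static void main(String[] args) {
        System.out.println(" 5 = " + intBits(5));
        System.out.println("-5 = " + intBits(-5));
        float f = 2.0f;
        System.out.println(f + ": sign=" + sign(f)
                + " exponent=" + (biasedExponent(f) - 127)
                + " significand bits=" + significandBits(f));
    }
}
```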

Keep in mind these problems are not with Java, but rather with computers in general. You will see the SAME problems in just about every computer language out there. If you want to get around these issues, you may want to look into the java.math package, as I believe it has objects that can represent large numbers and more precise decimals.
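As a rough illustration of both points, the sketch below adds 0.1 and 0.2 once with double and once with java.math.BigDecimal. The class and method names (DecimalPrecision, doubleSum, exactSum) are hypothetical; BigDecimal itself is part of the java.math package mentioned above.

```java
import java.math.BigDecimal;

// Sketch: binary floating point versus BigDecimal for decimal fractions.
class DecimalPrecision {
    // 0.1 and 0.2 have no exact binary representation, so the sum is slightly off.
    static double doubleSum() {
        return 0.1 + 0.2;
    }

    // BigDecimal built from strings keeps the decimal digits exactly.
    static BigDecimal exactSum() {
        return new BigDecimal("0.1").add(new BigDecimal("0.2"));
    }

    public static void main(String[] args) {
        System.out.println("double:     " + doubleSum());
        System.out.println("BigDecimal: " + exactSum());
        System.out.println("double sum equals 0.3? " + (doubleSum() == 0.3));
    }
}
```

The BigDecimal values are constructed from strings on purpose: new BigDecimal(0.1) would start from the already inexact double.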

As to your second question, well, it's really hard to answer. Perhaps if you could describe what it is you want to do in a little more detail, someone could help.

happyElephant
Jan 8, 2007, 11:23 PM
Dear mbabauer, thanks a lot for being so explicit in your answer.
In this particular case the input is never decimal (I can't count one and a half amino acids), but always an int. Furthermore, I insist on getting truncated numbers in the output, which should be 3, for example, if the input is 10 (int 10/3=3). However, my program displays 16... Where exactly could the code error be?

lazydog
Jan 9, 2007, 06:27 AM
There are a few things wrong with your program.

You need something like this:-

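import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

public class AnalyseADN
{
    public static void main(String args[]) throws IOException
    {
        BufferedReader in = new BufferedReader( new InputStreamReader( System.in ) );

        // Read one whole line at a time; readLine() returns null at end of input.
        String inStr;
        while ( ( inStr = in.readLine() ) != null )
        {
            try
            {
                // Convert the text of the line (e.g. "10") to the number 10.
                int n = Integer.parseInt( inStr.trim() );
                // Integer division truncates, so 10 gives 3.
                System.out.println("Le nombre total de triplets est "+(n/3)+".");
            }
            catch ( NumberFormatException e )
            {
                // Report bad input instead of silently swallowing the error.
                System.err.println( "Not a whole number: " + inStr );
            }
        }
    }
}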

b e n

lazydog
Jan 9, 2007, 06:33 AM
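I should add that the reason why your program doesn't work is:-

1) System.in.read() doesn't do what you think it does. It returns the next byte from the input stream, i.e. the code of a single character, not a number. You need to read a whole line's worth of characters at a time.

2) The while clause is wrong for reading a line. As written it only skips line terminators; to read up to the end of the line it would have to be while( n!='\n' && n!='\r' ), and you would have to collect each character rather than overwrite n.

3) Your loop terminates with 'n' set to the first character that is not a line terminator, not to the number you typed. For the input 10 that is the character '1', whose code is 49, so you get 49/3, which is 16.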
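The 16 reported for an input of 10 can be reproduced without a keyboard. The sketch below runs the same do/while loop as the original program over an in-memory stream that stands in for System.in; the class and method names (ReadOneByte, firstNonNewlineByte) are made up for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;

// Sketch: what the loop from the question actually leaves in n.
class ReadOneByte {
    // Same loop as the original program, fed from a string instead of System.in:
    // skip line terminators, stop at the first other byte.
    static int firstNonNewlineByte(String typed) {
        ByteArrayInputStream in =
                new ByteArrayInputStream(typed.getBytes(StandardCharsets.US_ASCII));
        int n;
        do {
            n = in.read();
        } while (n == '\n' || n == '\r');
        return n;
    }

    public static void main(String[] args) {
        int n = firstNonNewlineByte("10\n");
        // n is the character code of '1', which is 49, not the number 10.
        System.out.println("n = " + n + " (character '" + (char) n + "')");
        System.out.println("n / 3 = " + (n / 3)); // 49 / 3 = 16
    }
}
```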


To count the number of different triplets you could use a regular expression to find a three-letter sequence and then put that in a loop to extract each one from the input. You can then track the sequences by using a HashMap. This will let you determine the number of different sequences and also how many times each sequence occurs.
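One possible shape for that approach, as a minimal sketch: it assumes the triplets are three-letter words separated by spaces, as in the sample sequence from the first post, and the names TripletCounter and countTriplets are invented here.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch: count distinct three-letter triplets with a regex and a HashMap.
class TripletCounter {
    // A standalone word of exactly three letters (assumed input format).
    private static final Pattern TRIPLET = Pattern.compile("\\b[A-Za-z]{3}\\b");

    // Maps each triplet to the number of times it occurs in the sequence.
    static Map<String, Integer> countTriplets(String sequence) {
        Map<String, Integer> counts = new HashMap<>();
        Matcher matcher = TRIPLET.matcher(sequence);
        while (matcher.find()) {
            // merge() inserts 1 for a new key or adds 1 to the existing count.
            counts.merge(matcher.group().toUpperCase(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        String sequence = "LYS VAL PHE GLU ARG CYS GLU LEU VAL";
        Map<String, Integer> counts = countTriplets(sequence);
        // The number of keys is the number of different triplets.
        System.out.println("Different triplets: " + counts.size());
        System.out.println("Occurrences: " + counts);
    }
}
```

For the sample sequence the map ends up with 7 keys, matching the expected answer. If only the number of different triplets is needed and not the per-triplet counts, a HashSet would do the same job.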

b e n

happyElephant
Jan 12, 2007, 08:40 AM
Thanks a lot for your help.

mufflon
Jan 12, 2007, 06:41 PM
(...)
Now, a warning. By using floats you may STILL get answers that are off. This has less to do with truncation and more to do with how a decimal is stored.

Internally, all data in a computer is stored as binary, which is 0's and 1's. There are no decimals, no plus or minus signs, just binary 0's and 1's. A typical int or float value is usually stored internally as a 32-bit number. That means there are 32 slots to put data in.

(...)


Well, this is not really helpful for the OP, but as a complete sidetrack, floating point is based on the following system:

X * B ^ Y (the sizes of X and Y and the value of B vary by implementation)

B is the "base". It is not stored at all, but is usually a predefined constant.

The X part is an integer "with a built-in decimal point" - a high value here might correspond to "1"; this also uses 2's complement to generate +/-.

Y is the power to which B is raised.

Because of this definition, a floating point number will be very imprecise in some cases (I was writing a longer piece until I realised I am unable to really describe it in a structured way):

picture this (purely fictional example!):

if the base is 2, Y is +/- 32 and X is +/- 64 (the value is then divided by, say, 32)
if you want to create a floating point value of say 2 you'll just have:
X= 32, B = 2, Y = 1 (32/32 * 2^1 = 2)

However, if you want to describe 2.00123 you'll have a harder time finding anything close enough, so you will lose precision (the number might just be 2.0012 instead of 2.00123).
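The fictional format above is small enough to search exhaustively. The sketch below takes the post's numbers at face value (base 2, X from -64 to 64 divided by 32, Y from -32 to 32) and looks for the representable value closest to a target; ToyFloat, value and nearest are names invented for this example.

```java
// Sketch: the purely fictional format (X / 32) * 2^Y from the post.
class ToyFloat {
    // Value encoded by a given X and Y in the toy format.
    static double value(int x, int y) {
        return (x / 32.0) * Math.pow(2, y);
    }

    // Brute-force search for the representable value closest to the target.
    static double nearest(double target) {
        double best = 0.0;
        for (int y = -32; y <= 32; y++) {
            for (int x = -64; x <= 64; x++) {
                double candidate = value(x, y);
                if (Math.abs(candidate - target) < Math.abs(best - target)) {
                    best = candidate;
                }
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println("X=32, Y=1 -> " + value(32, 1));
        System.out.println("X=64, Y=0 -> " + value(64, 0));
        System.out.println("closest to 2.00123 -> " + nearest(2.00123));
    }
}
```

In this toy format the steps between neighbouring values around 2 are far coarser than 0.00123, so the closest value the search can find is 2.0 itself. Real float and double formats have many more bits, but the same kind of gap exists on a much finer scale.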

This kind of calculation is also rather tough for a computer: it either needs a unit that handles floating point, or it has to convert between float and integer (which floating point is built on). So you get larger numbers and decimals, but at the cost of more processing cycles and without the same precision.

edit: if you didn't follow my thread of reasoning just try wikipedia: clickety (http://en.wikipedia.org/wiki/Floating_point)

lazydog
Jan 13, 2007, 11:08 AM
if you want to create a floating point value of say 2 you'll just have:
X= 32, B = 2, Y = 1 (32/32 * 2^1 = 2)


After reading the wiki link I think I understand what you're trying to say, but shouldn't your example be written as X=64 and Y=0, i.e. 2 = ( 64/32 ) * 2^0?

b e n