Folding weirdness - multiple proteins per processor

Discussion in 'Distributed Computing' started by CoffeeMonkey, Aug 25, 2004.

  1. CoffeeMonkey macrumors regular

    Joined:
    Feb 23, 2003
    #1
    When I got home from work today, I noticed that my dual G5 is running four FahCore processes - and that each processor is working on two proteins. One of them is working on two instances of the same protein, and the other is working on two different ones.

    Is that normal?
     
  2. bousozoku Moderator emeritus

    Joined:
    Jun 25, 2002
    Location:
    Gone but not forgotten.
    #2
    That is not at all normal. I've had a couple of times when two cores were active, but one was working on a protein and the other was dead.
     
  3. CoffeeMonkey thread starter macrumors regular

    Joined:
    Feb 23, 2003
    #3
    Here's some output from 'top' and from 'work' - any ideas?

    PID  COMMAND     %CPU   TIME     #TH
    944  FahCore_78  30.2%  2:24:03  3
    943  FahCore_78  25.8%  3:01:37  3
    942  FahCore_78  33.6%  3:45:18  3
    941  FahCore_78  25.8%  6:47:19  3



    $ work
    Start of data

    Process 1
    Protein: p543_BBA5_ext
    Completed 175000 out of 500000 steps (35%)
    Completed 180000 out of 500000 steps (36%)
    Completed 185000 out of 500000 steps (37%)
    Completed 190000 out of 500000 steps (38%)
    Completed 195000 out of 500000 steps (39%)
    Protein: p730_gsgs_sd_h2o
    Completed 3400000 out of 10000000 steps (34%)
    Completed 3500000 out of 10000000 steps (35%)
    Completed 3600000 out of 10000000 steps (36%)
    Completed 3700000 out of 10000000 steps (37%)
    Completed 3800000 out of 10000000 steps (38%)
    -rw-r--r-- 1 pemulis1 pemulis1 293040 25 Aug 17:02 wudata_02.arc
    -rw-r--r-- 1 pemulis1 pemulis1 439320 25 Aug 20:55 wudata_02CP.arc
    -rw-r--r-- 1 pemulis1 pemulis1 421200 25 Aug 21:07 wudata_03.arc
    -rw-r--r-- 1 pemulis1 pemulis1 7860 25 Aug 21:07 wudata_03CP.arc

    Process 2
    Protein: p543_BBA5_ext
    Completed 20000 out of 500000 steps (4%)
    Completed 25000 out of 500000 steps (5%)
    Completed 30000 out of 500000 steps (6%)
    Completed 35000 out of 500000 steps (7%)
    Completed 40000 out of 500000 steps (8%)
    Protein: p543_BBA5_ext
    Completed 340000 out of 500000 steps (68%)
    Completed 345000 out of 500000 steps (69%)
    Completed 350000 out of 500000 steps (70%)
    Completed 355000 out of 500000 steps (71%)
    Completed 360000 out of 500000 steps (72%)
    -rw-r--r-- 1 pemulis1 pemulis1 146520 25 Aug 14:54 wudata_02.arc
    -rw-r--r-- 1 pemulis1 pemulis1 439320 25 Aug 20:53 wudata_02CP.arc
    -rw-r--r-- 1 pemulis1 pemulis1 586080 25 Aug 18:35 wudata_03.arc
    -rw-r--r-- 1 pemulis1 pemulis1 439320 25 Aug 21:05 wudata_03CP.arc
     
  4. stoid macrumors 601

    Joined:
    Feb 17, 2002
    Location:
    So long, and thanks for all the fish!
    #4
    It looks like you've somehow got mc68k's scripts running as two separate instances. Look in your login items and make sure the scripts are launched only once in there. Log out and log back in; maybe that'll fix it.
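
    One way to confirm duplicate instances is to count the FahCore processes in a 'top' listing; with one work unit per processor, a dual machine should show two. Below is a minimal sketch in Python, assuming the line format of the 'top' output in post #3. The function names fahcore_pids and extra_cores are made up for illustration and are not part of mc68k's scripts.

```python
import re

# Sample lines copied from the 'top' output in post #3.
TOP_OUTPUT = """\
944 FahCore_78 30.2% 2:24:03 3
943 FahCore_78 25.8% 3:01:37 3
942 FahCore_78 33.6% 3:45:18 3
941 FahCore_78 25.8% 6:47:19 3
"""


def fahcore_pids(top_text):
    """Return the PID of every FahCore process in a 'top'-style listing."""
    pids = []
    for line in top_text.splitlines():
        # Assumed layout: PID first, then the command name.
        match = re.match(r"\s*(\d+)\s+FahCore_\w+", line)
        if match:
            pids.append(int(match.group(1)))
    return pids


def extra_cores(top_text, expected):
    """Count FahCore processes beyond the expected number (never negative)."""
    return max(0, len(fahcore_pids(top_text)) - expected)


if __name__ == "__main__":
    pids = fahcore_pids(TOP_OUTPUT)
    print(f"{len(pids)} FahCore processes found: {pids}")
    print(f"Extra beyond one per processor: {extra_cores(TOP_OUTPUT, 2)}")
```

    With the sample above this finds four processes, two more than a dual machine should be running.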
     
  5. bousozoku Moderator emeritus

    Joined:
    Jun 25, 2002
    Location:
    Gone but not forgotten.
    #5
    That's just scary. I can't imagine how you've got things configured to allow more than one active work unit in a queue.dat file. The code I wrote for work is not meant to work over four directories, only two, but looking at work's output, it's obvious that all four work units are being updated. It's also obvious that not much is getting done quickly since you're dividing the time.

    My instinctive reaction would be to wait until the one at 73 percent finishes, then flush everything else and start over, but that's not necessarily what you want.
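
    To see which work unit is furthest along, the 'work' output can be parsed. This is only a rough sketch in Python, assuming the "Protein:" and "Completed N out of M steps" line formats shown in post #3; it is not the code behind the work command, and latest_progress and furthest_along are hypothetical names.

```python
import re

# A few lines in the format of the 'work' output from post #3.
WORK_OUTPUT = """\
Protein: p543_BBA5_ext
Completed 190000 out of 500000 steps (38%)
Completed 195000 out of 500000 steps (39%)
Protein: p730_gsgs_sd_h2o
Completed 3700000 out of 10000000 steps (37%)
Completed 3800000 out of 10000000 steps (38%)
"""


def latest_progress(text):
    """Return (protein, done, total) per unit, using its last 'Completed' line.

    A list is used rather than a dict because the same protein name can
    appear more than once. Units with no progress lines keep None values.
    """
    units = []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("Protein:"):
            units.append([line.split(":", 1)[1].strip(), None, None])
            continue
        match = re.match(r"Completed (\d+) out of (\d+) steps", line)
        if match and units:
            units[-1][1] = int(match.group(1))
            units[-1][2] = int(match.group(2))
    return [tuple(unit) for unit in units]


def furthest_along(units):
    """Return the unit with the highest completed fraction, or None."""
    started = [u for u in units if u[1] is not None and u[2]]
    if not started:
        return None
    return max(started, key=lambda u: u[1] / u[2])


if __name__ == "__main__":
    for name, done, total in latest_progress(WORK_OUTPUT):
        print(name, done, total)
    print("Furthest along:", furthest_along(latest_progress(WORK_OUTPUT)))
```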
     
  6. ChrisFromCanada macrumors 65816

    Joined:
    May 3, 2004
    Location:
    Hamilton, Ontario (CANADA)
    #6
    I have had this problem before while running the mc68k dual script and have heard of others with this problem too. The only solution seems to be to wait for the extra one(s) to finish and hope it doesn't happen again.
     
  7. Dreadnought macrumors 68020

    Joined:
    Jul 22, 2002
    Location:
    Almere, The Netherlands
    #7
    Do you also have 4 folding folders? Did you run the command work 1 or work 2 several times in Terminal?
     
  8. RugoseCone macrumors 6502

    Joined:
    Aug 22, 2002
    #8
    Similar issues?

    My production has really kind of tanked. Using the "work" command in terminal I see this...

    Process 1
    Protein: p224_NTL91Murea
    Protein: p1256_A7ext_298K_96
    Completed 900000 out of 5000000 steps (18%)
    Completed 950000 out of 5000000 steps (19%)
    Completed 1000000 out of 5000000 steps (20%)
    Completed 1050000 out of 5000000 steps (21%)
    Completed 1100000 out of 5000000 steps (22%)
    Protein: p214_villin4Mure


    Process 2
    Protein: p216_villin0Murea
    Protein: p1269_A7hel_298K_99ps
    Completed 900000 out of 5000000 steps (18%)
    Completed 950000 out of 5000000 steps (19%)
    Completed 1000000 out of 5000000 steps (20%)
    Completed 1050000 out of 5000000 steps (21%)
    Completed 1100000 out of 5000000 steps (22%)


    Has anybody else ever seen multiple proteins listed like this? I never have until yesterday. I can't find any evidence that it is trying to process all five, but maybe it is? Any suggestions or comments?
     
  9. bousozoku Moderator emeritus

    Joined:
    Jun 25, 2002
    Location:
    Gone but not forgotten.
    #9
    You merely have extra logfile_??.txt files that couldn't be deleted when the results were uploaded.
     
  10. RugoseCone macrumors 6502

    Joined:
    Aug 22, 2002
    #10

    Ah! Thank you. When I saw this thread it raised my concerns with this, since I had never seen it before.
     
  11. bousozoku Moderator emeritus

    Joined:
    Jun 25, 2002
    Location:
    Gone but not forgotten.
    #11
    You'll still have to go into the work folder and delete the ones with non-matching numbers, but you'll see pretty quickly what doesn't belong.

    I wrote the code so that it would tell me everything that was out there, regardless, since I kept finding things that weren't supposed to be there. :) I suppose I could have written an automatic clean up for the extra logfiles, but it never seemed worth the effort.
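
    For anyone who would rather not eyeball the filenames, here is a rough sketch in Python of how the leftovers could be listed. It rests on an assumption that is not confirmed anywhere in this thread: that "non-matching" means a logfile_NN.txt whose number NN has no wudata_NN file beside it. The function name stale_logfiles and the sample listing are made up for illustration, and the sketch only reports names; it deletes nothing.

```python
import re


def stale_logfiles(filenames):
    """Return logfile_NN.txt names whose NN matches no wudata_NN file.

    Assumption: a log file belongs to a live work unit only if a
    wudata file with the same number exists in the same folder.
    """
    active_numbers = set()
    logs = {}
    for name in filenames:
        wudata = re.match(r"wudata_(\d+)", name)
        logfile = re.fullmatch(r"logfile_(\d+)\.txt", name)
        if wudata:
            active_numbers.add(wudata.group(1))
        elif logfile:
            logs[name] = logfile.group(1)
    return sorted(n for n, num in logs.items() if num not in active_numbers)


if __name__ == "__main__":
    # Hypothetical work-folder listing, not taken from anyone's machine.
    listing = [
        "wudata_03.arc",
        "wudata_03CP.arc",
        "logfile_03.txt",
        "logfile_01.txt",
        "logfile_02.txt",
    ]
    for name in stale_logfiles(listing):
        print("possibly stale:", name)
```

    Check anything it reports by hand before deleting.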
     
  12. RugoseCone macrumors 6502

    Joined:
    Aug 22, 2002
    #12
    All cleaned up. At first when I opened the work folders I wasn't sure which one to get rid of. Never looked in there before.

    But like you said, it was pretty clear what didn't belong after reading through all the filenames. Thanks for all the help. Now if I could just get this machine to fold faster...
     
  13. bousozoku Moderator emeritus

    Joined:
    Jun 25, 2002
    Location:
    Gone but not forgotten.
    #13
    You're welcome and I'm glad you realised what was what. I've had to explain it in the past. ;)

    There are some things you can do to make folding faster but none seem really effective to me, other than leaving the machine alone.
     
  14. RugoseCone macrumors 6502

    Joined:
    Aug 22, 2002
    #14
    Well I certainly don't want to tinker. No pun intended. I just wish I'd get back to my daily production averages of the last several weeks. It's kind of hard to see your points drop and have someone that was 500 points behind you (that you thought was never coming back) rocket past.

    Ah well, I'm getting less interested in the points and moving upwards at this point.

    Again, thanks for clearing things up. With my recent slowdown, the undeleted log files, and this thread, I really thought something had gone horribly awry.
     
  15. bousozoku Moderator emeritus

    Joined:
    Jun 25, 2002
    Location:
    Gone but not forgotten.
    #15
    Does any of it really matter? I was way up on the list at one point in time and now, I'm falling a little every week. It's not that my numbers have gone down, just that people with multiple machines have come along. Oh well.
     
