FAH GPU Tracker V2

FAH GPU Tracker V2 is a Folding@Home Client tracking and control program


    Gpu issue on start of new project

    snowbird48

    Posts : 12
    Join date : 2011-01-20

    Post by snowbird48 on Sun Jan 30, 2011 3:40 pm

    Hi, I have noticed a bug using GPU3 when one project finishes and another starts. It looks very similar, and may be identical, to the one posted by "SheepMeister" on Jan. 21, 2011. This started after I upgraded from v.3.45 to v.3.46. The upgrade notification came up and I did it while projects were running. Sorry for that. I am sending a bigger log section for your study.

    I don't know if it matters, but I am using EVGA Precision solely for its software fan control, which is very nice when GPU2 projects are running. Also, when I started folding (not long ago) I installed the usual v.6.23 unicore client. After I installed the Tracker I saw no harm in continuing to run it, as it seems to run independently of the Tracker. Could this cause bugs?

    It is clearly stated that Heat Control does not work when running GPU3. I did not turn it off because I could quite possibly be assigned a GPU2 project, and I know how quickly things get hot without Heat Control. Is it okay to leave it on all the time?

    --- Opening Log file [January 30 18:58:35 UTC]


    # Windows GPU Console Edition #################################################
    ###############################################################################

    Folding@Home Client Version 6.30r1

    http://folding.stanford.edu

    ###############################################################################
    ###############################################################################

    Launch directory: C:\Users\Bob\Downloads\FAH GPU Tracker V2\GPU0
    Executable: C:\Users\Bob\Downloads\FAH GPU Tracker V2\FAH_GPU3.exe
    Arguments: -oneunit -forcegpu nvidia_g80 -verbosity 9 -gpu 0

    [18:58:35] - Ask before connecting: No
    [18:58:35] - User name: Orpeus (Team 35947)
    [18:58:35] - User ID: 6E54AA6B1BBFE596
    [18:58:35] - Machine ID: 3
    [18:58:35]
    [18:58:35] Gpu type=2 species=30.
    [18:58:35] Work directory not found. Creating...
    [18:58:35] Could not open work queue, generating new queue...
    [18:58:35] - Preparing to get new work unit...
    [18:58:35] - Autosending finished units... [January 30 18:58:35 UTC]
    [18:58:35] Cleaning up work directory
    [18:58:35] Trying to send all finished work units
    [18:58:35] + No unsent completed units remaining.
    [18:58:35] - Autosend completed
    [18:58:35] + Attempting to get work packet
    [18:58:35] Passkey found
    [18:58:35] - Will indicate memory of 12279 MB
    [18:58:35] Gpu type=2 species=30.
    [18:58:35] - Detect CPU. Vendor: GenuineIntel, Family: 6, Model: 10, Stepping: 5
    [18:58:35] - Connecting to assignment server
    [18:58:35] Connecting to http://assign-GPU.stanford.edu:8080/
    [18:58:35] Posted data.
    [18:58:35] Initial: 43AB; - Successful: assigned to (171.67.108.31).
    [18:58:35] + News From Folding@Home: Welcome to Folding@Home
    [18:58:35] Loaded queue successfully.
    [18:58:35] Gpu type=2 species=30.
    [18:58:35] Sent data
    [18:58:35] Connecting to http://171.67.108.31:8080/
    [18:58:36] Posted data.
    [18:58:36] Initial: 0000; - Receiving payload (expected size: 43445)
    [18:58:36] Conversation time very short, giving reduced weight in bandwidth avg
    [18:58:36] - Downloaded at ~84 kB/s
    [18:58:36] - Averaged speed for that direction ~84 kB/s
    [18:58:36] + Received work.
    [18:58:36] + Closed connections
    [18:58:36]
    [18:58:36] + Processing work unit
    [18:58:36] Core required: FahCore_15.exe
    [18:58:36] Core found.
    [18:58:36] Working on queue slot 01 [January 30 18:58:36 UTC]
    [18:58:36] + Working ...
    [18:58:36] - Calling '.\FahCore_15.exe -dir work/ -suffix 01 -nice 19 -priority 96 -nocpulock -checkpoint 3 -verbose -lifeline 6456 -version 630'

    [18:58:36]
    [18:58:36] *------------------------------*
    [18:58:36] Folding@Home GPU Core
    [18:58:36] Version 2.15 (Tue Nov 16 08:44:57 PST 2010)
    [18:58:36]
    [18:58:36] Compiler : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.42 for 80x86
    [18:58:36] Build host: amoeba
    [18:58:36] Board Type: NVIDIA/CUDA
    [18:58:36] Core : x=15
    [18:58:36] Window's signal control handler registered.
    [18:58:36] Preparing to commence simulation
    [18:58:36] - Looking at optimizations...
    [18:58:36] DeleteFrameFiles: successfully deleted file=work/wudata_01.ckp
    [18:58:36] - Created dyn
    [18:58:36] - Files status OK
    [18:58:36] sizeof(CORE_PACKET_HDR) = 512 file=<>
    [18:58:36] - Expanded 42933 -> 167707 (decompressed 390.6 percent)
    [18:58:36] Called DecompressByteArray: compressed_data_size=42933 data_size=167707, decompressed_data_size=167707 diff=0
    [18:58:36] - Digital signature verified
    [18:58:36]
    [18:58:36] Project: 11177 (Run 9, Clone 123, Gen 17)
    [18:58:36]
    [18:58:36] Assembly optimizations on if available.
    [18:58:36] Entering M.D.
    [18:58:38] Tpr hash work/wudata_01.tpr: 1345550053 3021858530 2924858552 2268201125 1115307878
    [18:58:38] Working on ALZHEIMER'S DISEASE AMYLOID
    [18:58:38] Client config found, loading data.
    [18:58:39] ***** Got a SIGTERM signal (2)
    [18:58:39] Killing all core threads

    Folding@Home Client Shutdown.
    [18:58:39] Starting GUI Server

    --- Opening Log file [January 30 18:58:39 UTC]


    # Windows GPU Console Edition #################################################
    ###############################################################################

    Folding@Home Client Version 6.30r1

    http://folding.stanford.edu

    ###############################################################################
    ###############################################################################

    Launch directory: C:\Users\Bob\Downloads\FAH GPU Tracker V2\GPU0
    Executable: C:\Users\Bob\Downloads\FAH GPU Tracker V2\FAH_GPU3.exe
    Arguments: -oneunit -forcegpu nvidia_g80 -verbosity 9 -gpu 0

    [18:58:39] - Ask before connecting: No
    [18:58:39] - User name: Orpeus (Team 35947)
    [18:58:39] - User ID: 6E54AA6B1BBFE596
    [18:58:39] - Machine ID: 3
    [18:58:39]
    [18:58:39] Gpu type=2 species=30.
    [18:58:39] Loaded queue successfully.
    [18:58:39]
    [18:58:39] - Autosending finished units... [January 30 18:58:39 UTC]
    [18:58:39] + Processing work unit
    [18:58:39] Trying to send all finished work units
    [18:58:39] Core required: FahCore_15.exe
    [18:58:39] + No unsent completed units remaining.
    [18:58:39] - Autosend completed
    [18:58:39] Core found.
    [18:58:39] Working on queue slot 01 [January 30 18:58:39 UTC]
    [18:58:39] + Working ...
    [18:58:39] - Calling '.\FahCore_15.exe -dir work/ -suffix 01 -nice 19 -priority 96 -nocpulock -checkpoint 3 -verbose -lifeline 6848 -version 630'

    [18:58:39]
    [18:58:39] *------------------------------*
    [18:58:39] Folding@Home GPU Core
    [18:58:39] Version 2.15 (Tue Nov 16 08:44:57 PST 2010)
    [18:58:39]
    [18:58:39] Compiler : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.42 for 80x86
    [18:58:39] Build host: amoeba
    [18:58:39] Board Type: NVIDIA/CUDA
    [18:58:39] Core : x=15
    [18:58:39] Window's signal control handler registered.
    [18:58:39] Preparing to commence simulation
    [18:58:39] - Ensuring status. Please wait.
    [18:58:46]
    [18:58:46] Folding@home Core Shutdown: CLIENT_DIED
    [19:02:05] ***** Got a SIGTERM signal (2)
    [19:02:05] Killing all core threads

    Folding@Home Client Shutdown.


    --- Opening Log file [January 30 19:02:29 UTC]


    # Windows GPU Console Edition #################################################
    ###############################################################################

    Folding@Home Client Version 6.30r1

    http://folding.stanford.edu

    ###############################################################################
    ###############################################################################

    Launch directory: C:\Users\Bob\Downloads\FAH GPU Tracker V2\GPU0
    Executable: C:\Users\Bob\Downloads\FAH GPU Tracker V2\FAH_GPU3.exe
    Arguments: -oneunit -forcegpu nvidia_g80 -verbosity 9 -gpu 0

    [19:02:29] - Ask before connecting: No
    [19:02:29] - User name: Orpeus (Team 35947)
    [19:02:29] - User ID: 6E54AA6B1BBFE596
    [19:02:29] - Machine ID: 3
    [19:02:29]
    [19:02:29] Gpu type=2 species=30.
    [19:02:30] Loaded queue successfully.
    [19:02:30]
    [19:02:30] - Autosending finished units... [January 30 19:02:30 UTC]
    [19:02:30] + Processing work unit
    [19:02:30] Trying to send all finished work units
    [19:02:30] Core required: FahCore_15.exe
    [19:02:30] + No unsent completed units remaining.
    [19:02:30] - Autosend completed
    [19:02:30] Core found.
    [19:02:30] Working on queue slot 01 [January 30 19:02:30 UTC]
    [19:02:30] + Working ...
    [19:02:30] - Calling '.\FahCore_15.exe -dir work/ -suffix 01 -nice 19 -priority 96 -nocpulock -checkpoint 3 -verbose -lifeline 732 -version 630'

    [19:02:30]
    [19:02:30] *------------------------------*
    [19:02:30] Folding@Home GPU Core
    [19:02:30] Version 2.15 (Tue Nov 16 08:44:57 PST 2010)
    [19:02:30]
    [19:02:30] Compiler : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.42 for 80x86
    [19:02:30] Build host: amoeba
    [19:02:30] Board Type: NVIDIA/CUDA
    [19:02:30] Core : x=15
    [19:02:30] Window's signal control handler registered.
    [19:02:30] Preparing to commence simulation
    [19:02:30] - Looking at optimizations...
    [19:02:30] - Files status OK
    [19:02:30] sizeof(CORE_PACKET_HDR) = 512 file=<>
    [19:02:30] - Expanded 42933 -> 167707 (decompressed 390.6 percent)
    [19:02:30] Called DecompressByteArray: compressed_data_size=42933 data_size=167707, decompressed_data_size=167707 diff=0
    [19:02:30] - Digital signature verified
    [19:02:30]
    [19:02:30] Project: 11177 (Run 9, Clone 123, Gen 17)
    [19:02:30]
    [19:02:30] Assembly optimizations on if available.
    [19:02:30] Entering M.D.
    [19:02:32] Will resume from checkpoint file work/wudata_01.ckp
    [19:02:32] Tpr hash work/wudata_01.tpr: 1345550053 3021858530 2924858552 2268201125 1115307878
    [19:02:32] Working on ALZHEIMER'S DISEASE AMYLOID
    [19:02:32] Client config found, loading data.
    [19:02:32] Starting GUI Server
    [19:02:32] Resuming from checkpoint
    [19:02:32] fcCheckPointResume: retreived and current tpr file hash:
    [19:02:32] 0 1345550053 1345550053
    [19:02:32] 1 3021858530 3021858530
    [19:02:32] 2 2924858552 2924858552
    [19:02:32] 3 2268201125 2268201125
    [19:02:32] 4 1115307878 1115307878
    [19:02:32] fcCheckPointResume: file hashes same.
    [19:02:32] fcCheckPointResume: state restored.
    [19:02:32] fcCheckPointResume: name work/wudata_01.log Verified work/wudata_01.log
    [19:02:32] fcCheckPointResume: name work/wudata_01.trr Verified work/wudata_01.trr
    [19:02:32] fcCheckPointResume: name work/wudata_01.xtc Verified work/wudata_01.xtc
    [19:02:32] fcCheckPointResume: name work/wudata_01.edr Verified work/wudata_01.edr
    [19:02:32] fcCheckPointResume: state restored 2
    [19:02:32] Resumed from checkpoint
    [19:02:32] Setting checkpoint frequency: 500000
    [19:02:32] Completed 500001 out of 50000000 steps (1%).
    [19:05:04] Completed 1000000 out of 50000000 steps (2%).
    [19:07:36] Completed 1500000 out of 50000000 steps (3%).
    [19:10:07] Completed 2000000 out of 50000000 steps (4%).
    [19:12:38] Completed 2500000 out of 50000000 steps (5%).
    [19:15:10] Completed 3000000 out of 50000000 steps (6%).
    [19:17:41] Completed 3500000 out of 50000000 steps (7%).
    [19:20:12] Completed 4000000 out of 50000000 steps (8%).
    [19:22:44] Completed 4500000 out of 50000000 steps (9%).
    [19:25:16] Completed 5000000 out of 50000000 steps (10%).
    [19:27:48] Completed 5500000 out of 50000000 steps (11%).
    [19:30:19] Completed 6000000 out of 50000000 steps (12%).
    [19:32:51] Completed 6500000 out of 50000000 steps (13%).
    [19:35:22] Completed 7000000 out of 50000000 steps (14%).
    [19:37:54] Completed 7500000 out of 50000000 steps (15%).
    [19:40:25] Completed 8000000 out of 50000000 steps (16%).
    [19:42:56] Completed 8500000 out of 50000000 steps (17%).
    [19:45:28] Completed 9000000 out of 50000000 steps (18%).
    [19:47:59] Completed 9500000 out of 50000000 steps (19%).
    [19:50:32] Completed 10000000 out of 50000000 steps (20%).
    [19:53:03] Completed 10500000 out of 50000000 steps (21%).
    [19:55:34] Completed 11000000 out of 50000000 steps (22%).
    [19:58:06] Completed 11500000 out of 50000000 steps (23%).
    [20:00:37] Completed 12000000 out of 50000000 steps (24%).
    [20:03:08] Completed 12500000 out of 50000000 steps (25%).
    [20:05:40] Completed 13000000 out of 50000000 steps (26%).
    [20:08:11] Completed 13500000 out of 50000000 steps (27%).
    [20:10:42] Completed 14000000 out of 50000000 steps (28%).
    [20:13:14] Completed 14500000 out of 50000000 steps (29%).
    [20:15:46] Completed 15000000 out of 50000000 steps (30%).
    [20:18:18] Completed 15500000 out of 50000000 steps (31%).
    [20:20:50] Completed 16000000 out of 50000000 steps (32%).
    [20:23:22] Completed 16500000 out of 50000000 steps (33%).
    [20:25:54] Completed 17000000 out of 50000000 steps (34%).
    [20:28:26] Completed 17500000 out of 50000000 steps (35%).
    [20:31:00] Completed 18000000 out of 50000000 steps (36%).
    [20:33:35] Completed 18500000 out of 50000000 steps (37%).
    [20:36:06] Completed 19000000 out of 50000000 steps (38%).
    [20:38:37] Completed 19500000 out of 50000000 steps (39%).
    [20:41:10] Completed 20000000 out of 50000000 steps (40%).
    [20:43:41] Completed 20500000 out of 50000000 steps (41%).
    [20:46:15] Completed 21000000 out of 50000000 steps (42%).
    [20:48:47] Completed 21500000 out of 50000000 steps (43%).
    [20:51:19] Completed 22000000 out of 50000000 steps (44%).
    [20:53:51] Completed 22500000 out of 50000000 steps (45%).
    [20:56:22] Completed 23000000 out of 50000000 steps (46%).
    [20:58:53] Completed 23500000 out of 50000000 steps (47%).
    [21:01:24] Completed 24000000 out of 50000000 steps (48%).
    [21:03:56] Completed 24500000 out of 50000000 steps (49%).
    [21:06:31] Completed 25000000 out of 50000000 steps (50%).
    jedi95
    Dev Team Member

    Posts : 307
    Join date : 2010-05-26
    Job/hobbies : FAH GPU Tracker V2 Developer

    Re: Gpu issue on start of new project

    Post by jedi95 on Sun Jan 30, 2011 5:26 pm

    snowbird48 wrote: Hi, I have noticed a bug using GPU3 when one project finishes and another starts. It looks very similar, and may be identical, to the one posted by "SheepMeister" on Jan. 21, 2011. This started after I upgraded from v.3.45 to v.3.46. The upgrade notification came up and I did it while projects were running. Sorry for that. I am sending a bigger log section for your study. I don't know if it matters, but I am using EVGA Precision solely for its software fan control, which is very nice when GPU2 projects are running. Also, when I started folding (not long ago) I installed the usual v.6.23 unicore client. After I installed the Tracker I saw no harm in continuing to run it, as it seems to run independently of the Tracker. Could this cause bugs? It is clearly stated that Heat Control does not work when running GPU3. I did not turn it off because I could quite possibly be assigned a GPU2 project, and I know how quickly things get hot without Heat Control. Is it okay to leave it on all the time?

    Updating while WUs are running shouldn't cause problems unless the update completes before the FahCore processes exit. This only affects core 15, because the current version has a bug where it can take up to 10 seconds to exit after the client is properly closed. Until Stanford fixes this bug, you can prevent issues by stopping the GPU clients manually before updating and then waiting a few seconds.
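
For anyone who wants to script the "stop, then wait" step, here is a minimal Python sketch. It is not part of the Tracker. The function name, the `fahcore_` name prefix, the 15-second timeout, and the injected `list_process_names` callable are all assumptions made for illustration; you would supply your own way of listing running process names.

```python
import time


def wait_for_cores_to_exit(list_process_names, timeout_s=15.0, poll_s=0.5,
                           prefix="fahcore_"):
    """Poll until no running process name starts with `prefix`.

    `list_process_names` is a hypothetical callable that returns the names
    of currently running processes. Returns True once no core is running,
    or False if a core is still running when the timeout expires.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        names = [name.lower() for name in list_process_names()]
        if not any(name.startswith(prefix) for name in names):
            return True  # no FahCore left, safe to update
        if time.monotonic() >= deadline:
            return False  # a core is still running, do not update yet
        time.sleep(poll_s)
```

The timeout is set a little above the 10 seconds mentioned above so that a slow core 15 exit still counts as a clean stop.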

    EVGA Precision or any overclocking/fan control programs won't cause any problems with the Tracker.

    Running the single-core CPU client separately from the Tracker won't cause problems as long as there are no machine ID conflicts. The Tracker uses machine IDs 1-10, so the CPU client will work fine if you assign it an ID between 11 and 16.
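
The machine ID rule can be written down as a small check. This is only an illustrative Python sketch: the ID ranges come from the post above, while the function names and the `other_ids_in_use` parameter are hypothetical.

```python
TRACKER_IDS = range(1, 11)  # machine IDs 1-10, used by the Tracker
MAX_MACHINE_ID = 16         # highest ID mentioned in the post


def free_machine_ids(other_ids_in_use=()):
    """List machine IDs not used by the Tracker or by other clients."""
    taken = set(TRACKER_IDS) | set(other_ids_in_use)
    return [i for i in range(1, MAX_MACHINE_ID + 1) if i not in taken]


def has_conflict(machine_id, other_ids_in_use=()):
    """True if `machine_id` collides with the Tracker or another client."""
    return machine_id in TRACKER_IDS or machine_id in set(other_ids_in_use)
```

For example, `has_conflict(3)` is true because ID 3 falls in the Tracker's range, while any value returned by `free_machine_ids()` is safe for a separately run client.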

    Using Heat Control on GPU3 projects won't cause any problems, but the settings will simply be ignored by the FahCore for those WUs. This is why it is listed in the known issues section of the readme. You can also set up the Heat Control rules so they only apply settings for GPU2 WUs if you want.



    snowbird48

    Posts : 12
    Join date : 2011-01-20

    Thanks

    Post by snowbird48 on Sun Jan 30, 2011 5:54 pm

    Appreciate the info. I will go back to Heat Control and set it for GPU2. There is no machine ID conflict for the v.6.23 client. Should I do a clean install of Tracker 3.46 to eliminate the possibility of corruption from updating while WUs were running? Thanks again.
    jedi95
    Dev Team Member

    Posts : 307
    Join date : 2010-05-26
    Job/hobbies : FAH GPU Tracker V2 Developer

    Re: Gpu issue on start of new project

    Post by jedi95 on Sun Jan 30, 2011 7:19 pm

    snowbird48 wrote: Appreciate the info. I will go back to Heat Control and set it for GPU2. There is no machine ID conflict for the v.6.23 client. Should I do a clean install of Tracker 3.46 to eliminate the possibility of corruption from updating while WUs were running? Thanks again.

    Unless you are having problems there is no need to do this.

    Things that could go wrong:
    1. WU gets corrupted and returns as failed (this would show in the Tracker's log so I doubt this is the case for you)
    2. Progress for the current WU fails to report because of duplicate FAHlog.txt files. If the WU progress is changing then this isn't the case.

    Otherwise you should be fine and reinstalling isn't necessary.
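
To check the second point without watching the Tracker, one option is to read the progress lines out of the client log. The Python sketch below assumes the "Completed N out of M steps" line format shown in the log earlier in this thread. The function names are hypothetical, and you would pass in the text of whichever FAHlog.txt the client is actually writing.

```python
import re

# Matches lines such as:
# [19:05:04] Completed 1000000 out of 50000000 steps (2%).
STEP_LINE = re.compile(
    r"\[(\d{2}):(\d{2}):(\d{2})\] Completed (\d+) out of (\d+) steps"
)


def progress_points(log_text):
    """Return (seconds_since_midnight, steps_done, steps_total) tuples."""
    points = []
    for match in STEP_LINE.finditer(log_text):
        hours, minutes, seconds, done, total = map(int, match.groups())
        points.append((hours * 3600 + minutes * 60 + seconds, done, total))
    return points


def is_progressing(log_text):
    """True if the last two progress lines show the step count rising."""
    points = progress_points(log_text)
    return len(points) >= 2 and points[-1][1] > points[-2][1]
```

If `is_progressing` returns true for the current log, the WU progress is changing and the duplicate-log case above is unlikely to apply.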


