World Community Grid Forums
Category: Active Research | Forum: Mapping Cancer Markers | Thread: Neverending MCM tasks
Thread Status: Active Total posts in this thread: 86
shanen0
Cruncher Joined: Feb 4, 2021 Post Count: 20 Status: Offline |
Uh... If they check the results on a second machine that happens to experience the same problem as the first machine...
NOT the way to do valid science. Fortunately, I stopped caring a long time ago. Perhaps I should just stop donating the cycles? I probably could have won a Bitcoin lottery by now if I had chosen that scam over the hope of helping research. Oh yeah. Just checked and found two more. I should check the other Windows 10 machines, but why care? (And haven't seen much evidence of concern from the new managers of WCG.) |
||
|
Falconet
Master Cruncher Portugal Joined: Mar 9, 2009 Post Count: 3265 Status: Offline Project Badges: |
If you are still getting issues, then you should stop running MCM until there's a fix.
I sent them a ticket regarding the Mac issue a while ago, but I don't know if they are looking into this.
----------------------------------------
AMD Ryzen 5 1600AF 4C/8T 3.2 GHz - 85W
AMD Ryzen 5 2500U 4C/8T 2.0 GHz - 28W
Intel Z3740 4C/4T 1.8 GHz - 6W
[Edit 2 times, last edit by Falconet at Nov 10, 2021 9:44:42 AM]
||
|
knreed
Former World Community Grid Tech Joined: Nov 8, 2004 Post Count: 4504 Status: Offline Project Badges: |
Can one of the users who are experiencing this issue please try the following:
1. Let us know what you currently have the setting "Use no more than: X% of processor time" set to.
2. Then change that to 100%. See if that makes the issue go away.
3. Also, please go into log options and select app_msg_send. Watch for the issue to re-occur and then post the messages from this log.

I am working with a user via the support box where it appears that the suspend and resume messages sent to throttle CPU usage are not being processed fast enough, and so the BOINC client keeps killing and restarting the research app, which causes it to continually restart from the last checkpoint. This only occurs every few minutes, but it is enough to prevent it from making progress.
[Edit 1 times, last edit by knreed at Nov 10, 2021 2:34:54 PM]
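The failure mode described above can be illustrated with a toy model. This is only a sketch under stated assumptions: the function `simulate`, its parameters, and all timings are hypothetical and do not come from the BOINC client or the MCM application. It assumes one unit of work per second of run time, a checkpoint every `checkpoint_every` seconds of progress, and (when throttling is active) a kill and restart every `kill_every` seconds that throws away everything since the last checkpoint.

```python
def simulate(total_work, checkpoint_every, kill_every, max_wall=100_000):
    """Toy model (not BOINC code): return run-time seconds needed to finish,
    or None if the task has not finished after max_wall seconds.

    kill_every=None models a 100% CPU setting, where no throttle messages
    are sent and the app is therefore never killed.
    """
    progress = 0      # seconds of useful work done so far
    saved = 0         # progress stored in the last checkpoint
    since_kill = 0    # seconds since the last restart
    for wall in range(1, max_wall + 1):
        progress += 1
        since_kill += 1
        if progress >= total_work:
            return wall
        if progress - saved >= checkpoint_every:
            saved = progress          # write a checkpoint
        if kill_every is not None and since_kill >= kill_every:
            progress = saved          # killed: resume from last checkpoint
            since_kill = 0
    return None


# Hypothetical numbers, chosen only to show the three regimes.
print(simulate(1000, 600, None))  # no throttling
print(simulate(1000, 200, 300))   # kills cost some work, task still finishes
print(simulate(1000, 600, 300))   # killed before every checkpoint
```

In this model a task becomes "neverending" as soon as the kills arrive more often than the checkpoints, which is consistent with the observation that setting the limit to 100% lets the task run to completion.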
||
|
Felix Kaeufer
Cruncher Joined: Feb 3, 2012 Post Count: 29 Status: Offline Project Badges: |
No problem. I will try that.
I usually set it to 26 % or 34 %, because I found that these levels give a high efficiency in terms of the CPU time : run time ratio at low fan speeds. (The good values seem to almost match 1/n, where n is a natural number. 0.13, 0.17, 0.2, 0.5 are other good examples.)
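The 1/n pattern mentioned above can be checked with a few lines of arithmetic. This is a minimal sketch; the helper name `nearest_reciprocal` and the search limit `max_n` are assumptions made for illustration, and the input values are the settings quoted in the post.

```python
def nearest_reciprocal(fraction, max_n=10):
    """Return (n, 1/n) for the natural number n <= max_n whose
    reciprocal is closest to the given fraction."""
    n = min(range(1, max_n + 1), key=lambda k: abs(fraction - 1 / k))
    return n, 1 / n


# Settings quoted in the post, written as fractions of processor time.
for setting in (0.26, 0.34, 0.13, 0.17, 0.2, 0.5):
    n, value = nearest_reciprocal(setting)
    print(f"{setting:.2f} is closest to 1/{n} = {value:.3f}")
```

Each quoted setting sits at or just above a reciprocal (for example 26 % just above 1/4 and 34 % just above 1/3). Why those values would be more efficient is not explained in the thread.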
||
|
Felix Kaeufer
Cruncher Joined: Feb 3, 2012 Post Count: 29 Status: Offline Project Badges: |
[20:31:22] Initializing
[20:31:31] Running
[20:31:31] EvaluateFitnessOfStartingGeneSignatures 12071

Seems to work!
||
|
Felix Kaeufer
Cruncher Joined: Feb 3, 2012 Post Count: 29 Status: Offline Project Badges: |
Works. (Link to work unit)
However, I have gone back to not fetching MCM work units, so that I can run multiple work units at acceptable fan speeds.
||
|
knreed
Former World Community Grid Tech Joined: Nov 8, 2004 Post Count: 4504 Status: Offline Project Badges: |
@Felix - thanks for trying that change and letting me know that running it at 100% did in fact allow it to run to completion.
I would appreciate it if other users who experience the long running tasks could follow the steps I outlined above so that we could confirm that this is what is going on. Thanks. |
||
|
shanen0
Cruncher Joined: Feb 4, 2021 Post Count: 20 Status: Offline |
Interesting if true. But there are reasons why I don't want to reset this machine to 100%.
(Just killed 5 more of them. Still, it's nice to hear that someone might be working on the problem and this isn't just an orphan project with orphan software.)
[Edit 2 times, last edit by shanen0 at Nov 16, 2021 5:39:12 PM]
||
|
Felix Kaeufer
Cruncher Joined: Feb 3, 2012 Post Count: 29 Status: Offline Project Badges: |
shanen0, can't you try that for just one work unit on one core like I did? I generally prefer lower settings as well, but knreed needs some help to solve the problem.
|
||
|
Sgt.Joe
Ace Cruncher USA Joined: Jul 4, 2006 Post Count: 7237 Status: Offline Project Badges: |
OK, this may have nothing to do with never-ending tasks, but it may have a bearing. I have noticed on one of my machines, which runs a mix of MCM and a few OPN, that there is a significant difference in the run times of the MCM units. Looking at the logs of the valid units, there is a difference in the VMethod line: the longer-running units have VMethod = NFCV and the shorter units have VMethod = LOO. The NFCV units take about 60% to 100% longer than the LOO units. If (big if) the units which get hung are all one or the other, this may provide a clue about where to look for a potential problem. Neither type has ever become a never-ending unit for me, but I thought I would mention the possibility for others.
Cheers
Sgt. Joe
*Minnesota Crunchers*
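For anyone who wants to test the NFCV versus LOO idea on their own results, the comparison can be sketched as below. The only detail taken from the thread is that the result log contains a line of the form `VMethod = NFCV` or `VMethod = LOO`. The function name, the `(log_text, runtime_hours)` input shape, and the sample numbers are all made up for illustration.

```python
import re
from collections import defaultdict
from statistics import mean

# Matches the "VMethod = NFCV" / "VMethod = LOO" line quoted in the post.
VMETHOD_RE = re.compile(r"VMethod\s*=\s*(\w+)")


def mean_runtime_by_vmethod(results):
    """Group run times by the VMethod found in each result log.

    results: iterable of (log_text, runtime_hours) pairs (assumed shape).
    Returns a dict mapping VMethod name to mean run time in hours.
    """
    groups = defaultdict(list)
    for log_text, runtime_hours in results:
        match = VMETHOD_RE.search(log_text)
        method = match.group(1) if match else "unknown"
        groups[method].append(runtime_hours)
    return {method: mean(times) for method, times in groups.items()}


# Made-up sample data, not real work unit results.
sample = [
    ("VMethod = LOO", 2.0),
    ("VMethod = NFCV", 3.4),
    ("VMethod = LOO", 2.2),
    ("VMethod = NFCV", 4.0),
]
print(mean_runtime_by_vmethod(sample))
```

Tagging each stuck task with its VMethod in the same way would show whether the hung units cluster in one group.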
||
|
|