Message boards : Questions and problems : Scheduling and CPU use
| Author | Message |
|---|---|
|
Send message Joined: 9 Dec 24 Posts: 12
|
I'm having some BOINC issues related to task scheduling and CPU usage.

System: Win 11 2H/25, i9-14900K, RTX 4050 Ti, 64 GB RAM, running at default BIOS settings, not overclocked.
BOINC: BOINC Manager 8.2.8 x64, BOINC BUDA runner WSL installer (26-02-05 download).
BOINC settings (current): CPU 55%, 100% of time, suspend 90%, mem 85%, swap 30 mins.
BOINC tasks include Einstein, Rosetta, SI Dock, WCG, Dennis, GPU Grid. Also LHC (not currently in use; see the comment under CPU utilisation).
Monitoring tools: Argus Fan controller/monitor (CPU temp), Resource Manager (CPU loads) and Intel's Extreme Tuning Util [ETU] (CPU load, temp, throttling, memory).

2 issues (these may be related):

1 Task scheduling in BOINC
I am getting multiple instances where "Time to finish" vastly exceeds the deadline time (in some cases over 10 days). If I suspend most other tasks, then these tasks finish rapidly. It appears that BOINC is not recognising the issue and swaps these tasks out or reduces their CPU load. This results in having to continually check status and adjust tasks.

2 CPU utilisation
a) On a reboot (rarely, runs 24/7) BOINC will show CPU utilisation of 100% on first starting it. Closing and reopening BOINC Manager resolves this.
b) I am unable to get Resource Manager to show more than 3% CPU utilisation for any single task (equivalent to 1 CPU core).
c) Total CPU utilisation (including other tasks) generally runs at about 50% (+/- 10%) except when using specific high-load tasks (image editing). In this case I have now set these as "Exclusive apps" in BOINC.
d) Throttling is not occurring.
e) If I change the % of CPU used by BOINC, then all that changes is the number of tasks, not CPU utilisation (as reported by Resource Manager/ETU).
f) A specific issue when using LHC is that it takes the CPU usage to 100%. I have set the .wslconfig to: [ws12] memory=4GB processors=1 but this has not changed this.

Any ideas/suggestions would be welcome.
Richard |
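A side note on point f): the post quotes the section header as `[ws12]` (digit one), while the header WSL documents for global settings is `[wsl2]` (lowercase L). If the file really contains the header as quoted, the `memory` and `processors` lines would most likely be ignored. The sketch below is only a quick way to check a file's text; the helper name `wsl2_settings` is made up for illustration, and it uses Python's standard `configparser` on the assumption that the file is plain INI-style text.

```python
import configparser


def wsl2_settings(text):
    """Return the key/value pairs found under [wsl2], or {} if that section is absent."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    if not parser.has_section("wsl2"):
        return {}
    return dict(parser["wsl2"])


as_posted = "[ws12]\nmemory=4GB\nprocessors=1\n"   # header exactly as quoted in the post
as_documented = "[wsl2]\nmemory=4GB\nprocessors=1\n"

print(wsl2_settings(as_posted))       # {}
print(wsl2_settings(as_documented))   # {'memory': '4GB', 'processors': '1'}
```

Changes to `.wslconfig` only apply after WSL has been fully stopped and restarted (for example with `wsl --shutdown`). Whether the LHC tasks then respect the limit is a separate question that this check cannot answer.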
|
Send message Joined: 25 May 09 Posts: 1442
|
I can't answer every one of your questions, but I can suggest what's happening with a few of them....

1) Initial estimates of "time to finish" are known to be very pessimistic. What you have to do is allow at least a dozen tasks of a given type to run to completion and be validated. This can take a long time if a project is slow in performing the validation (WCG is currently very bad at this). Even taking this into consideration, if the tasks delivered by a given project vary dramatically in real runtime, the number of completed and validated tasks needed can balloon.
Also, if you have a very large cache of tasks, BOINC can really struggle to work out which task is really going to run beyond its deadline and which task will actually finish much quicker than its initial estimate. The solution is to have a small cache; one suggestion is to set "store at least" to 1 day (or less), and "store up to an additional" to 0.1 day (or less). This means you don't have such a high risk of congestion, which triggers the need to run tasks at high priority.
Also, suspending tasks can cause more problems than it solves, as some applications roll back processing on a restart.....

2)
a) Yes, almost expected, as there's a lot of housekeeping performed when a computer restarts, both by the operating system and by BOINC.
b) Many applications are only single threaded, so will only ever use one core, so 3% is about right.
c) Expected behaviour when you are only allowing BOINC to use 55% of your CPU cores. As you have found, using "exclusive apps" does stop BOINC processing.
d) Good news!
e) The "use x% of CPU" setting tells BOINC it can only use that fraction of the CPU's cores. One problem is that the Windows task manager can get a bit confused.....
f) I've no experience of using WSL to run BOINC, but have found it to cause some rather unexpected behaviours when doing other work. |
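To put numbers on points b), c) and e): the arithmetic below is a rough sketch, not BOINC's own code. It assumes the processor exposes 32 logical CPUs (which is what an i9-14900K reports, if that is the chip meant in the first post) and that BOINC rounds the allowed CPU count down; both are assumptions, and the function name is invented for the example.

```python
def boinc_cpu_budget(logical_cpus, use_pct):
    """Return (usable CPUs, % of the whole machine per single-threaded task, total %)."""
    # Assumption: the allowed CPU count is truncated, with a floor of one CPU.
    usable = max(1, int(logical_cpus * use_pct / 100))
    per_task_pct = 100 / logical_cpus
    return usable, per_task_pct, usable * per_task_pct


usable, per_task, total = boinc_cpu_budget(logical_cpus=32, use_pct=55)
print(usable, per_task, total)   # 17 3.125 53.125
```

On those assumptions a single-threaded task can never show more than about 3% in a whole-machine monitor, and seventeen of them add up to roughly 53%, which lines up with the "about 50%" total reported. Changing the percentage changes how many tasks run, not how hard each one works.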
|
Send message Joined: 9 Dec 24 Posts: 12
|
In reply to robsmith's message of 10 Mar 2026:

> I can't answer every one of your questions, but I can suggest what's happening with a few of them....

Thanks for the quick response Rob.

> 1) Initial estimates of "time to finish" are known to be very pessimistic, what you have to do is allow at least a dozen tasks of a given type to run to completion and be validated.

I've been running these tasks for years, I should have accumulated enough "Knowledge".

> Also, if you have very large cache of tasks BOINC can really struggle to work out which task is really going to run beyond its deadline and which task will actually finish much quicker than its initial estimate. The solution is to have a small cache, one suggestion is to have "store at least" 1 (or less), and "store up to an additional" set to 0.1 (or less). This means you don't have such a high risk of congestion, which triggers the need to run tasks at high priority.

Cache is set to .25 + .1 day. With this and 55% CPU usage there are several "Waiting to Start" tasks. "Struggle to work out": I would have thought that "deadline - time to end" would give a clear answer?

> Also, suspending tasks can cause more problems than it solves as some applications roll back processing on a restart.....

Oh, that is not good. Exclusive apps does that to all. If I am managing manually, I suspend only those with longer deadlines.

2) It appears that only BOINC does this. I've not let it run for very long, though I suspect long enough for it to settle down, which it does not seem to do.

> b) Many applications are only single threaded, so will only ever use one core, so 3% is about right.

That was my assumption. However, allowing many tasks (%CPU near 100) seems to make this issue worse. It does however provide greater throughput.

> d) Good news!

My objective is to get as much done as possible without throttling.

> e) The use x% of CPU tells BOINC it can only use that fraction of the CPU's cores. One problem is that the Windows task manager can get a bit confused.....

I suspected a possible conflict/misunderstanding may be an issue.

> f) I've no experience of using WSL to run BOINC, but have found it to cause some rather unexpected behaviours when doing other work.

My understanding is that WSL only runs when needed; although multiple WSLx tasks are shown, they show as 0 activity. BOINC itself and anything running within it is not using WSL, as they are Windows apps. It is only tasks from LHC that need WSL.

Richard |
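On the question of whether "deadline - time to end" gives a clear answer: for one task in isolation it does, but tasks share cores, so a task with plenty of slack on paper can still finish late once it has queued behind others. The toy simulation below illustrates that effect only. It is not BOINC's scheduler; it runs tasks earliest-deadline-first without preemption, and the `Task` class and `predicted_misses` function are invented for the example.

```python
import heapq
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    remaining_h: float   # estimated hours of CPU time still needed
    deadline_h: float    # hours from now until the deadline


def predicted_misses(tasks, cores):
    """Run tasks earliest-deadline-first on `cores` cores; return names that finish late."""
    free_at = [0.0] * cores            # time at which each core next becomes free
    heapq.heapify(free_at)
    late = []
    for task in sorted(tasks, key=lambda t: t.deadline_h):
        start = heapq.heappop(free_at)  # the next core to become available
        finish = start + task.remaining_h
        heapq.heappush(free_at, finish)
        if finish > task.deadline_h:
            late.append(task.name)
    return late


# Every task has slack on its own (deadline minus remaining time is positive).
queue = [Task("a", 10, 12), Task("b", 10, 15), Task("c", 10, 18)]
print(predicted_misses(queue, cores=1))   # ['b', 'c']
print(predicted_misses(queue, cores=3))   # []
```

The answer depends on the whole queue and on how many cores are free, and every figure fed in is itself an estimate. If the runtime estimates are stale or frozen, any prediction of this kind inherits the error, which is why a smaller cache reduces the damage.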
|
Send message Joined: 7 Dec 24 Posts: 243 |
In reply to Rich's message of 11 Mar 2026:

> I've been running these tasks for years, I should have accumulated enough "Knowledge".

After 10 Valid results have been returned, when the next Task is downloaded, the Estimated Completion time will update based on the previously returned results. And from then on, whenever new work is downloaded, the Estimated Completion time will be adjusted based on the previously reported results.

Part (maybe most?) of the problem is probably the BOINC Manager version you are running: BOINC v 8.2.4 broke the updating of the Estimated completion time and left it frozen at the value it was at the time you upgraded. I don't know if the latest version fixed it, but there was no mention of it in the Change logs, and I just haven't had the energy or enthusiasm to try it out to see if it is or isn't still broken. But if the processing time for a given application varies considerably depending on the Task, then not having the Estimated completion time updating will cause issues with work scheduling.

And the other part of the problem (even if the Estimated Completion time is updating correctly) is that every time you micro-manage, you undo what the BOINC Manager is trying to achieve, and what it is trying to achieve is to meet your resource share settings and not miss any deadlines. The more projects you run, the larger the cache, the less compute time, the fewer the cores/threads available, the longer it takes for things to settle down: instead of days or a week or so, it could take 4-8 weeks. And every time you suspend some Tasks to start others running, you undo its efforts and it has to start from scratch when you un-suspend.

> 2 CPU utilisation

Don't know why that is. For me, on all of my systems, BOINC goes hard upon restart until it can communicate with the rest of itself (i.e. until the little Red dot clears up from the icon on the Task Bar).

> c) Total CPU utilisation (including other tasks generally runs about 50% (+/- 10%) except when using specific high load tasks (Image editing). In this case I have now set these as "Exclusive apps" in Boinc.

As you aren't using all your cores/threads, unless your image editing software needs more than the other 45% not being used by BOINC, there's no need to use the Exclusive apps option. And the Exclusive apps option will impact on work scheduling, particularly if it occurs occasionally but for significant periods of time when it does come into effect. The BOINC manager will assume it has so many cores/threads available, and so many hours per day to process work. Then there's an extended period (or periods) where BOINC is suspended, so when BOINC can process work again it has to juggle what it has, and when work is due, but with much less time than it thought it had to do it. Time goes by without the Exclusive application stopping BOINC processing, so it continues to add to the time it has to process work, and then Exclusive apps comes into effect again... Rinse and repeat. The smaller the cache, the lesser the impact. But there will always be an impact.

> f) A specific issue when using LHC is that it takes the CPU usage to 100%. I have set the .wslconfig to:

Like Rob, sorry, can't help with that. I'd suggest checking out their forums for help with this issue.

Grant Darwin NT. |
|
Send message Joined: 25 May 09 Posts: 1442
|
I've just noticed you are running one of Intel's processors with a mix of performance and efficiency cores. This really affects the way apps run, and can be a real pain to get BOINC properly optimised. In part this explains why, as you increase the number of cores in use, the actual utilisation drops: the "efficiency" cores struggle, which leads to them going into power limitation states. When you try to run too many tasks you reach a point where the operating system starts temporarily swapping tasks in and out so it can perform what it needs to do (it's quite amazing how much modern operating systems do in the background!). Each time this happens there is a very short pause in processing, which looks to most task monitors like a reduction in utilisation. This applies to all processors, and the transition from "running as expected" to "why am I losing utilisation?" is very sudden. The proportion of cores at which it happens depends on exactly what combination of operating system, processor and memory is in use. To that end, it is always worth leaving at least one full core for the operating system to use for its own purposes (it may be more on your heterogeneous-core processor). |
Dave Send message Joined: 28 Jun 10 Posts: 3255
|
All I can add is that things are a lot simpler if you restrict yourself to running just one or two projects at a time. Sadly, the adjusting estimated time bug is not fixed even in 8.3.0. (My WCG tasks take about 37-38 minutes each. Estimated time to completion has remained at around 2 hours 30 minutes over several months and processing well over 12 tasks per day. Even with the horrendous backlog in validation, enough have been validated that it should have changed by now.) I will download and compile the latest master, though with WCG and the validation issues it will still be a few weeks before I can tell. |
|
Send message Joined: 5 Oct 06 Posts: 5159
|
There definitely seems to be a problem with updating run time estimates as BOINC 'learns' about changing host and task characteristics. There are three players in this game:
The project's BOINC server
The BOINC client
The BOINC Manager

The server and client communicate via sched_request...xml and sched_reply...xml files. The manager plays no part in estimating runtime, but simply displays the local end of that conversation in a more convenient form for the user.

Number of tasks completed: 56012
Max tasks per day: 56309
Number of tasks today: 17
Consecutive valid tasks: 56009
Average processing rate: 6.21 GFLOPS
Average turnaround time: 0.47 days

I added that machine in August 2024, and in January 2026 the project in question (Numberfields) moved on to a new set of data, taking significantly longer to crunch each task (contemporaneous comment). In theory, the extra 36% runtime should have been picked up by the server long ago, but as I look at it now, the manager is still showing an estimated runtime of 59 mins 49 secs for each task, as it did before Christmas. [it's a nicely memorable number!]

So, you'll have to pick your way through all those various stages. Is the server database updating properly? (I think it should be providing the 'average processing rate' for the most recent 100 completed tasks.) Is the APR properly copied into the 'sched_reply' when new work is downloaded? (I see <flops>6206437894.959294</flops>.) Is that figure copied to the proper place in client_state.xml? (I see <flops>8357769989.897738</flops>, which suggests not.)

Numberfields server is showing software version: 24527, which I think is an outdated numbering scheme: my computer is running Linux Mint v22.2, and the BOINC software (both client and manager) is v8.2.8 from their repository. |
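As a rough check on the two `<flops>` figures quoted above: a runtime estimate of this kind is essentially the task's estimated work divided by the speed the client believes the application runs at. The sketch below assumes exactly that simple relationship (with any duration correction factor left at 1.0) and that the 59 min 49 s shown in the Manager was derived from the client_state.xml value. The function name and the back-calculated work figure are illustrative, not values taken from any file.

```python
def estimated_runtime_s(fpops_est, flops, dcf=1.0):
    """Assumed relationship: estimated seconds = estimated operations / speed, scaled by DCF."""
    return fpops_est / flops * dcf


SERVER_FLOPS = 6206437894.959294   # <flops> seen in sched_reply (from the post)
CLIENT_FLOPS = 8357769989.897738   # <flops> seen in client_state.xml (from the post)
DISPLAYED_S = 59 * 60 + 49         # 59 min 49 s shown by the Manager

# Work size implied by the displayed estimate, if the client value was the one used.
implied_fpops = DISPLAYED_S * CLIENT_FLOPS

corrected_s = estimated_runtime_s(implied_fpops, SERVER_FLOPS)
ratio = corrected_s / DISPLAYED_S
print(f"estimate using the server's figure: {corrected_s / 60:.1f} min ({ratio:.3f}x)")
```

On those assumptions the estimate would be about 35% longer if the server's figure were in use, which is in the same region as the 36% increase in runtime mentioned in the post. That fits the suggestion that the newer value is reaching the sched_reply but not the copy the client estimates from.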
Dave Send message Joined: 28 Jun 10 Posts: 3255
|
> The project's BOINC server

Thanks for the reminder Richard. While I don't think the issue is limited to WCG (the only project I have running currently), it is entirely possible that with their state of flux, their bit of the equation is not happening at the moment, so don't take what is happening on my computer as a guarantee that it is my client's fault!

Edit: just found this on GitHub, from where I raised this.
github-project-automation AenBleidd |
Vitalii Koshura Send message Joined: 29 Mar 17 Posts: 200
|
In reply to Dave's message of 11 Mar 2026:
This just means I have selected this ticket for another round of review. BOINC maintainer. For any insight, check my BOINC Development Blog. |
|
Send message Joined: 9 Dec 24 Posts: 12
|
Thanks for the input. This is a combined response having been through your suggestions.

The latest BOINC ver I see is 8.2.8 which is what I am using. I don't see an 8.3 (for Win x64).

It seems that:
1) BOINC time estimates are broken. I am not concerned with the initial estimate, only when I see excessive values (17 days recently) for tasks that have been running for some time.
2) BOINC does not play nicely with high-end Intel processors.
3) Suspending may be counterproductive, so setting Exclusive apps may not be a good idea. DXO PhotoLab will grab all the CPU it can get, up to 100% on multiple cores. However, it only does this when actually processing (outputting), not when setting up parameters for each image. Depending on the number of images and what is being done to them, a batch may take 30 sec to 5+ mins on this machine, but it may take an hour to do the setup. Similarly, using suspend to give more CPU to other tasks may result in the suspended tasks taking significantly longer. Suspending before a reboot (rarely, runs 24/7) seems inevitable.
4) I am currently only getting work from 3 sources, Einstein, WCG and SI Dock, so this may not be an issue.
5) Given the CPU scheduling issue (Item 2 above), it may be appropriate to reduce the %CPU utilization (currently 55%, but previously 100%, which resulted in many tasks sitting at near-zero utilization). I'll try this and see if it resolves the problem without reducing throughput too much.

Thanks,
Richard |
|
Send message Joined: 7 Dec 24 Posts: 243 |
In reply to Vitalii Koshura's message of 11 Mar 2026: In reply to Dave's message of 11 Mar 2026:

I wish I could find the actual post to refresh my memory. Back when the broken Estimated completion time updating was first mentioned, someone posted that it was broken for projects using Credit New. If they didn't use Credit New, it was still updating the Estimated completion time.

Edit: Ah, found it. The post was from Keith Myers in the GitHub bug report. It's not Credit New/not Credit New but using DCF/not using DCF.

BOINC does not adjust estimated time based on previous tasks any more #6449
Another report of Remaining (estimated) time not updating on BOINC message boards.

Grant Darwin NT. |
|
Send message Joined: 5 Oct 06 Posts: 5159
|
In reply to Rich's message of 11 Mar 2026:

> The latest BOINC ver I see is 8.2.8 which is what I am using. I don't see an 8.3 (for Win x64).

BOINC version numbering: if the middle number is
even - this version is intended for general use
odd - the codebase is being prepared for the next release, and can't be trusted for general use. Use only for testing, with care.

If you compile your own copy for testing, note that versions like 8.3.0 may not all be the same - the version number won't be changed while a stream of separate fixes is added. |
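The even/odd rule described above is easy to apply mechanically. This is just a small sketch of that rule as stated in the post; the function name is made up, and it assumes version strings of the plain `major.minor.patch` form.

```python
def boinc_release_kind(version):
    """Classify a 'major.minor.patch' string by the parity of its middle number."""
    major, minor, patch = (int(part) for part in version.split("."))
    return "general use" if minor % 2 == 0 else "development/testing"


print(boinc_release_kind("8.2.8"))   # general use
print(boinc_release_kind("8.3.0"))   # development/testing
```

As the post points out, the label is all the number tells you: two self-compiled 8.3.0 builds made on different days can contain different code.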
Dave Send message Joined: 28 Jun 10 Posts: 3255
|
> The latest BOINC ver I see is 8.2.8 which is what I am using. I don't see an 8.3 (for Win x64).

For 8.3.0 I compile my own from source using the latest master code on GitHub. As Richard points out, it not only can but will vary from day to day. |
Copyright © 2026 University of California.
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License,
Version 1.2 or any later version published by the Free Software Foundation.