Good news for Mac users. HadAM3P Latest News???
Author | Message |
---|---|
Joined: 20 Feb 06 Posts: 158 Credit: 1,251,176 RAC: 0 |
Two more tasks (using v6.08 for Mac) completed just before 8pm UTC today, 3rd August. Both registered 72,000 on the Task Details page and on the Trickles Info page. Strangely, both apparently completed Post Processing successfully and, for both, the 3 zip files were uploaded, yet both tasks sat in the Tasks page flagged as Ready to Report at 100% (with the next 2 tasks already running, as would be expected). Normally I would have expected these completed tasks to be reported, but they had to be pushed by an update from the Project page. I would not have thought that the delay in credits on trickles would delay the final display of 72,096 time steps, nor should the graphics display still be missing from the Task Details pages of tasks 9268429 and 9268416. Maybe it will update tomorrow, but I cannot see why anything other than the credits granted should be delayed. Keith |
Joined: 5 Sep 04 Posts: 7629 Credit: 24,240,330 RAC: 0 |
The post here by Ageless lists the conditions that will trigger a "Report". In the case of this project, the most common trigger is the next trickle_up, which will come from the next model started. It is now suspected that there may be a clash between Post Processing and the generation of the last trickle. This will need to be investigated. In the meantime, I'd suggest running other model types. The slab models are fairly short. |
Joined: 20 Feb 06 Posts: 158 Credit: 1,251,176 RAC: 0 |
"The post here by Ageless lists the conditions that will trigger a 'Report'."
Les, both tasks now show an error as follows:
<core_client_version>6.6.36</core_client_version>
<![CDATA[
<stderr_txt>
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
20:09:46 (12434): called boinc_finish
</stderr_txt>
]]>
Two trickle-up messages were sent an hour ago from the next 2 tasks but, no doubt, too late for the 72,096 to be amended on the relevant pages? This seems to require an amendment to the script, even if I was "wrong" to use the Update command for the project. The next 2 tasks will be left without using Update, to await the next trickle which, as you suggest, may complete the tasks and produce the elusive 72,096 report. But maybe all will work normally once the trickle credits are also working correctly. In another week I will have details from the next 2 tasks to report on this thread again!!! Keith |
Joined: 20 Feb 06 Posts: 158 Credit: 1,251,176 RAC: 0 |
As far as I could see, all the trickles were "triggered" as would be expected. Only the reporting of the 2 tasks was missed, although it did appear to be effected after the Update. Here is the message page for the period of the last trickles and uploads, followed by the Update and also the trickles of the following two tasks. Maybe it will be of some help, I hope:
Mon 3 Aug 19:59:34 2009 climateprediction.net Sending scheduler request: To send trickle-up message.
Mon 3 Aug 19:59:34 2009 climateprediction.net Not reporting or requesting tasks
Mon 3 Aug 19:59:48 2009 climateprediction.net Computation for task hadam3p_nh59_1961_2_006240311_0 finished
Mon 3 Aug 19:59:49 2009 climateprediction.net Starting hadam3p_naj9_1990_2_006231743_1
Mon 3 Aug 19:59:49 2009 climateprediction.net Starting task hadam3p_naj9_1990_2_006231743_1 using hadam3p version 608
Mon 3 Aug 19:59:51 2009 climateprediction.net Started upload of hadam3p_nh59_1961_2_006240311_0_1.zip
Mon 3 Aug 19:59:51 2009 climateprediction.net Started upload of hadam3p_nh59_1961_2_006240311_0_2.zip
Mon 3 Aug 19:59:55 2009 climateprediction.net Scheduler request completed
Mon 3 Aug 20:09:31 2009 climateprediction.net Sending scheduler request: To send trickle-up message.
Mon 3 Aug 20:09:31 2009 climateprediction.net Not reporting or requesting tasks
Mon 3 Aug 20:09:46 2009 climateprediction.net Scheduler request completed
Mon 3 Aug 20:09:48 2009 climateprediction.net Computation for task hadam3p_nh5f_1997_2_006240317_1 finished
Mon 3 Aug 20:09:49 2009 climateprediction.net Starting hadam3p_naxg_1966_2_006232254_2
Mon 3 Aug 20:09:49 2009 climateprediction.net Starting task hadam3p_naxg_1966_2_006232254_2 using hadam3p version 608
Mon 3 Aug 20:14:11 2009 climateprediction.net Finished upload of hadam3p_nh59_1961_2_006240311_0_2.zip
Mon 3 Aug 20:14:11 2009 climateprediction.net Started upload of hadam3p_nh59_1961_2_006240311_0_3.zip
Mon 3 Aug 20:14:50 2009 climateprediction.net Finished upload of hadam3p_nh59_1961_2_006240311_0_3.zip
Mon 3 Aug 20:14:50 2009 climateprediction.net Started upload of hadam3p_nh5f_1997_2_006240317_1_1.zip
Mon 3 Aug 20:17:42 2009 climateprediction.net Finished upload of hadam3p_nh59_1961_2_006240311_0_1.zip
Mon 3 Aug 20:17:42 2009 climateprediction.net Started upload of hadam3p_nh5f_1997_2_006240317_1_2.zip
Mon 3 Aug 20:32:17 2009 climateprediction.net Finished upload of hadam3p_nh5f_1997_2_006240317_1_1.zip
Mon 3 Aug 20:32:17 2009 climateprediction.net Started upload of hadam3p_nh5f_1997_2_006240317_1_3.zip
Mon 3 Aug 20:32:43 2009 climateprediction.net Finished upload of hadam3p_nh5f_1997_2_006240317_1_3.zip
Mon 3 Aug 20:33:06 2009 climateprediction.net Finished upload of hadam3p_nh5f_1997_2_006240317_1_2.zip
Mon 3 Aug 20:57:52 2009 climateprediction.net update requested by user
Mon 3 Aug 20:57:54 2009 climateprediction.net Sending scheduler request: Requested by user.
Mon 3 Aug 20:57:54 2009 climateprediction.net Reporting 2 completed tasks, not requesting new tasks
Mon 3 Aug 20:57:59 2009 climateprediction.net Scheduler request completed
Tue 4 Aug 01:47:36 2009 climateprediction.net Sending scheduler request: To send trickle-up message.
Tue 4 Aug 01:47:36 2009 climateprediction.net Not reporting or requesting tasks
Tue 4 Aug 01:47:41 2009 climateprediction.net Scheduler request completed
Tue 4 Aug 01:48:32 2009 climateprediction.net Sending scheduler request: To send trickle-up message.
Tue 4 Aug 01:48:32 2009 climateprediction.net Not reporting or requesting tasks
Tue 4 Aug 01:48:37 2009 climateprediction.net Scheduler request completed |
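For anyone wanting to measure how long the finished tasks sat unreported, the message log above can be read mechanically. The following is only a rough sketch: the function name `parse_line` is made up for illustration, the four log lines are copied from the post above, and the timestamp layout is assumed to be exactly as pasted (English day and month names).

```python
from datetime import datetime

# Four lines copied from the message log quoted in the post above.
LOG = """\
Mon 3 Aug 19:59:48 2009 climateprediction.net Computation for task hadam3p_nh59_1961_2_006240311_0 finished
Mon 3 Aug 20:09:48 2009 climateprediction.net Computation for task hadam3p_nh5f_1997_2_006240317_1 finished
Mon 3 Aug 20:33:06 2009 climateprediction.net Finished upload of hadam3p_nh5f_1997_2_006240317_1_2.zip
Mon 3 Aug 20:57:54 2009 climateprediction.net Reporting 2 completed tasks, not requesting new tasks
"""


def parse_line(line):
    """Split one pasted log line into (timestamp, project, message).

    Hypothetical helper; assumes the first five space-separated fields
    form the timestamp and the sixth is the project name.
    """
    parts = line.split(" ", 6)
    when = datetime.strptime(" ".join(parts[:5]), "%a %d %b %H:%M:%S %Y")
    return when, parts[5], parts[6]


events = [parse_line(line) for line in LOG.splitlines()]

last_finish = max(t for t, _, msg in events if msg.startswith("Computation for task"))
last_upload = max(t for t, _, msg in events if msg.startswith("Finished upload"))
reported = next(t for t, _, msg in events if msg.startswith("Reporting"))

wait_after_compute = reported - last_finish
wait_after_upload = reported - last_upload

print("Waited after last computation finished:", wait_after_compute)
print("Waited after last upload finished:", wait_after_upload)
```

On these lines the tasks were reported only when the user forced an update, roughly 48 minutes after the second computation finished and about 25 minutes after the last upload completed.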
Joined: 20 Feb 06 Posts: 158 Credit: 1,251,176 RAC: 0 |
REFERRING TO THE PREVIOUS 2 MESSAGES I HAVE SENT, YOU WILL NOTICE THAT THE ERROR MESSAGE SHOWS THAT AT 20:09:46 THE BOINC QUIT REQUEST WAS SENT AND ACTIONED. THAT WAS BEFORE THE UPLOADS OF THE ZIP FILES, WHICH SURELY MUST BE WRONG. KEITH |
Joined: 9 Jan 07 Posts: 467 Credit: 14,549,176 RAC: 317 |
"REFERRING TO THE PREVIOUS 2 MESSAGES I HAVE SENT, YOU WILL NOTICE THAT THE ERROR MESSAGE SHOWS THAT AT 20:09:46 THE BOINC QUIT REQUEST WAS SENT AND ACTIONED." The uploading of the zip files is a separate process from model computation. If network activity is off, for example, the zip files will be kept until communication is possible again. That's a BOINC design feature. So it doesn't matter if uploading takes place after the "Computation for task X finished" message. |
Joined: 20 Feb 06 Posts: 158 Credit: 1,251,176 RAC: 0 |
I note that Mac 823790 is still getting tasks completed at 72,000 instead of 72,096. Using version 6.08. Keith |
Joined: 1 Jan 07 Posts: 943 Credit: 34,412,615 RAC: 5,040 |
"I note that Mac 823790 is still getting tasks completed at 72,000 instead of 72,096." But it is no longer displaying "Unable to load library hadam3p_se_6.07_i686-apple-darwin.dylib" in the stderr_out. So it seems to be suffering from the new version of the problem (cross-platform), not the old version (Mac and Linux only). |
Joined: 20 Feb 06 Posts: 158 Credit: 1,251,176 RAC: 0 |
That last error message was from a task running v6.07, which is the problem version. It is the most recent task, run by v6.08, that still has the problem of finishing at 72,000, which is the very thing v6.08 was written to cure!!! It shows this error message:
<core_client_version>5.10.32</core_client_version>
<![CDATA[
<stderr_txt>
19:49:20 (28555): called boinc_finish
</stderr_txt>
]]>
There is no reference to a missing library. Keith |
Joined: 1 Jan 07 Posts: 943 Credit: 34,412,615 RAC: 5,040 |
Please read my post more carefully, and see my PM reply to your PM. |
Joined: 20 Feb 06 Posts: 158 Credit: 1,251,176 RAC: 0 |
"Please read my post more carefully, and see my PM reply to your PM." Yes. OK, Richard. My apologies. Keith |
Joined: 20 Feb 06 Posts: 158 Credit: 1,251,176 RAC: 0 |
Now that the 5% credit inflation seems to have been sorted out, there remains the problem of the 72,096 time steps not being completed, along with the post-processing report and graphs not being included in the result. I have 2 HadAM3P tasks just completed. One is hadam3p_n0b3_1971_2_006218489_0. It has been uploaded and its _2.zip file has been transferred. Its stderr out message shows as:
=============
<core_client_version>6.6.36</core_client_version>
<![CDATA[
<stderr_txt>
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
22:44:23 (68933): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
03:36:56 (270): called boinc_finish
</stderr_txt>
]]>
==================
The other task is hadam3p_n9f0_1992_2_006230294_3. It is awaiting upload at 100% progress. Its _2.zip file is awaiting transfer and, of course, there is no stderr out message yet. Both tasks register 72,000 as the last trickle, with 2079 credits (with the added 5%), and the final trickle of 72,096 is still missing. "Show Graphics" in BOINC Manager shows that post processing does appear to start after the 72,096 time step seems to complete, without that step registering in the task details. It might be noted that prior to the recent stoppages, the 5% credit increases, etc., the stderr out messages showed as:
================
<core_client_version>6.6.36</core_client_version>
<![CDATA[
<stderr_txt>
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
15:36:53 (90410): called boinc_finish
</stderr_txt>
]]>
=======================
Hope this may help to diagnose the missing 72,096 problem on Mac OS X (and others?). Keith |
Joined: 3 Oct 06 Posts: 43 Credit: 8,017,057 RAC: 0 |
Yes, and others. Since the end of July I haven't had this final step go through. |
Joined: 20 Feb 06 Posts: 158 Credit: 1,251,176 RAC: 0 |
"Now that the 5% credit inflation seems to have been sorted out." I have now had the stderr message for task hadam3p_n9f0_1992_2_006230294_3. It is:
==============
<core_client_version>6.6.36</core_client_version>
<![CDATA[
<stderr_txt>
CPDN Monitor - Quit request from BOINC...
Suspended CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
22:44:23 (68934): No heartbeat from core client for 30 sec - exiting
CPDN Monitor - No 'heartbeat' from BOINC...
CPDN Monitor - Quit request from BOINC...
04:14:47 (271): called boinc_finish
</stderr_txt>
]]>
==================
Hopefully this missing 72,096 problem will soon be history. Keith |
Joined: 20 Feb 06 Posts: 158 Credit: 1,251,176 RAC: 0 |
Jim, I see that both recent tasks finished at the same time without the final 72,096 trickle shown. The stderr out message there shows as:
========================
<core_client_version>6.6.36</core_client_version>
<![CDATA[
<stderr_txt>
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
called boinc_finish
</stderr_txt>
]]>
=====================
Previously, when a completed 72,096 got the full credit, the stderr out message was:
================
<core_client_version>6.6.36</core_client_version>
<![CDATA[
<stderr_txt>
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
CPDN Monitor - Quit request from BOINC...
called boinc_finish
</stderr_txt>
]]>
=======================
How relevant the difference is, I do not know (21 quit requests compared to only 10), although I did see one other success at 18 quit requests!!!!
I have been running HADAM3P on my Mac for some time and am trying to see a pattern in why the 72,096 trickle is always missing with the graph detail. Next time I shall try running only one HADAM3P task and see if that finishes properly. Keith |
Joined: 29 Sep 04 Posts: 2363 Credit: 14,611,758 RAC: 0 |
I don't think that running just one HadAM3P at a time will make any difference to the final timestep. The v6.08 HadAM3P that Tolu ran on the Beta project produced its final timestep, and the current version for Windows and Linux used to almost always produce it, so I don't understand why none of the current models seem to be producing it on any of the 3 platforms. I'm going to bring this to Tolu's attention again. It's as if the generation of this final timestep depends on the batch of models, not the model version. If we don't get the final timestep we don't get the graph either. |
Joined: 5 Aug 04 Posts: 907 Credit: 299,864 RAC: 0 |
I talked to Tolu - 72,000 timesteps is the end of the run, the model just has an odd feature that it likes to run one more day afterwards. |
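The arithmetic behind "one more day" can be checked quickly. This is only a sketch under stated assumptions: it assumes the 96 steps between 72,000 and 72,096 are exactly the extra model day mentioned above, and that a model day is 24 hours long. Neither is confirmed anywhere in this thread, and the variable names are made up for illustration.

```python
# Figures quoted in this thread.
LAST_TRICKLE_SEEN = 72_000   # where tasks currently stop reporting
EXPECTED_LAST_STEP = 72_096  # final timestep users expected to see

extra_steps = EXPECTED_LAST_STEP - LAST_TRICKLE_SEEN

# Assumption: the extra steps are exactly the "one more day" of model time.
steps_per_model_day = extra_steps

# Under that assumption, the run proper covers this many model days.
model_days = LAST_TRICKLE_SEEN / steps_per_model_day

# Assumption: a model day lasts 24 hours, which gives the step length.
seconds_per_step = 24 * 60 * 60 / steps_per_model_day

print(f"extra steps after the end of the run: {extra_steps}")
print(f"model days in the run proper: {model_days:.0f}")
print(f"implied model timestep: {seconds_per_step / 60:.0f} minutes")
```

If those assumptions hold, a task that stops at 72,000 has finished the run proper and is missing only the trailing day's trickle.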
Joined: 9 Jan 07 Posts: 467 Credit: 14,549,176 RAC: 317 |
Keith, the "CPDN Monitor - Quit request from BOINC..." message isn't significant: it just records when BOINC stops, including when the user requests an exit. So both the logs you report are effectively the same, and both are "clean". The "stderr out" facility has its uses but also has some deficiencies, which start with the name:
1. "stderr out" is an appalling bit of computer jargon to which no normal human being should be exposed. Its principal defect is that it suggests that the associated log contains errors, when it may not.
2. The log has no dates or times, so people usually think that all these messages were created at the end of the run, which isn't the case: the log is for the whole run.
3. No distinction is made between errors, warnings and plain old recording of activity, so it's only by experience or knowledge (from where?) that the log becomes meaningful.
4. The error numbers and text come from various systems (BOINC, the science application, the operating system) in various formats and with no explanations.
5. The text randomly vanishes, so no systematic analysis of the logs is possible.
It's really a facility intended for the project (but not, I suspect, used very often) and provided as a courtesy to the users. Treat with caution. Iain |
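To illustrate point 3, here is a rough sketch of how the lines seen in this thread could be sorted by hand into errors, warnings and plain activity. The labels are this sketch's own reading of the posts above, not anything BOINC or CPDN defines, and the names `RULES` and `classify` are made up for illustration. Given the caveats in the list (no dates, vanishing text), treat it as a reading aid only.

```python
from collections import Counter

# Substring -> label. The labels are assumptions based on the discussion
# in this thread, not an official BOINC or CPDN severity scheme.
RULES = [
    ("Unable to load library", "error"),
    ("No heartbeat from core client", "warning"),
    ("No 'heartbeat' from BOINC", "warning"),
    ("Quit request from BOINC", "activity"),
    ("called boinc_finish", "activity"),
]


def classify(line):
    """Return the label of the first rule whose substring occurs in the line."""
    for needle, label in RULES:
        if needle in line:
            return label
    return "unknown"


# stderr lines as posted earlier in this thread for task
# hadam3p_n0b3_1971_2_006218489_0.
SAMPLE = [
    "CPDN Monitor - Quit request from BOINC...",
    "Suspended CPDN Monitor - Quit request from BOINC...",
    "CPDN Monitor - Quit request from BOINC...",
    "22:44:23 (68933): No heartbeat from core client for 30 sec - exiting",
    "CPDN Monitor - No 'heartbeat' from BOINC...",
    "CPDN Monitor - Quit request from BOINC...",
    "03:36:56 (270): called boinc_finish",
]

counts = Counter(classify(line) for line in SAMPLE)
print(dict(counts))
```

Read this way, that log contains no line labelled "error", which fits the description of such logs as "clean".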
Joined: 20 Feb 06 Posts: 158 Credit: 1,251,176 RAC: 0 |
Thanks for the explanation, Iain. But something is consistently different between the 2 sets of tasks, so I thought there might be an indication of what could be causing the result to be curtailed at the 72,000 time step (even though it was apparently followed by the post processing, according to the graphics display for the task each time). But, alas, it seems not, you reckon. The very reason I started this thread was to query the loss of the 72,096 trickle and other info, which somewhat dampened the pleasure of getting a faster version, 6.07, with which to crunch. The bug that was found and corrected did not, however, cure the 72,096 problem when version 6.08 was released. This is the history of improvement for my Mac. Two tasks were completed each time:
Version 6.06, successfully to 72,096:
Apr 17: Av 8.82 sec/ts
Apr 23: Av 9.05 sec/ts
Version 6.07, failing to get last trickle and graph:
July 9/10: Av 7.22 sec/ts
July 20: Av 6.87 sec/ts
July 26/27: Av 6.82 sec/ts
Version 6.08, also failing to get last trickle and graph:
Aug 3: Av 6.80 sec/ts
Aug 11: Av 6.74 sec/ts
Aug 17: Av 6.67 sec/ts
Keith |
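The sec/ts figures above can be turned into a per-version comparison. This is a minimal sketch using only the averages posted above; the names are made up for illustration, and the run-length estimate assumes a task keeps the quoted pace for all 72,000 timesteps.

```python
from statistics import mean

# Average seconds per timestep, as posted above for one Mac.
SEC_PER_TS = {
    "6.06": [8.82, 9.05],
    "6.07": [7.22, 6.87, 6.82],
    "6.08": [6.80, 6.74, 6.67],
}

means = {version: mean(values) for version, values in SEC_PER_TS.items()}
baseline = means["6.06"]

for version, value in means.items():
    faster = 100 * (baseline - value) / baseline
    print(f"v{version}: {value:.2f} sec/ts ({faster:.1f}% faster than v6.06)")

# Rough wall-clock time for 72,000 timesteps at the latest quoted pace,
# assuming the pace stays constant for the whole run.
run_days = 72_000 * SEC_PER_TS["6.08"][-1] / 86_400
print(f"about {run_days:.1f} days per task at 6.67 sec/ts")
```

On these numbers v6.07 is roughly a fifth faster than v6.06 and v6.08 slightly faster again, so the speed gain survived even though the final trickle did not.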
Joined: 5 Aug 04 Posts: 173 Credit: 1,843,046 RAC: 0 |
This is a common issue on all platforms. It must have occurred during the migration. I'll fix this asap. |