2021-12-16
09:50 <jmm@cumin2002> START - Cookbook sre.hosts.downtime for 1:00:00 on kubetcd2004.codfw.wmnet with reason: switch to drbd storage [production]
09:46 <moritzm> added ganeti2028 to ganeti codfw cluster T294139 [production]
09:38 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on ganeti2015.codfw.wmnet with reason: Temporarily remove node from Ganeti for reimage [production]
09:38 <jmm@cumin2002> START - Cookbook sre.hosts.downtime for 5 days, 0:00:00 on ganeti2015.codfw.wmnet with reason: Temporarily remove node from Ganeti for reimage [production]
09:11 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti2024.codfw.wmnet with OS buster [production]
08:55 <moritzm> drain primary/secondary instances off ganeti2015 T296622 [production]
08:43 <moritzm> switch ml-etcd2003 to DRBD-based storage to allow migration for reimages [production]
08:39 <mwdebug-deploy@deploy1002> helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . [production]
08:32 <mwdebug-deploy@deploy1002> helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . [production]
08:31 <urbanecm@deploy1002> Synchronized php-1.38.0-wmf.13/extensions/GrowthExperiments/: 35c055cead3d240625b76d21aa4e685525ca0d4b: MentorPageMentorManager: Do not fail hard with no mentor list configured (T297827) (duration: 01m 09s) [production]
08:20 <jmm@cumin2002> START - Cookbook sre.hosts.reimage for host ganeti2024.codfw.wmnet with OS buster [production]
08:07 <dcausse> restart blazegraph on wdqs1013 (jvm stuck for 4hours) [production]
03:27 <hoo> Stopped rebuildItemsPerSite on mwmaint1002 (was slightly beyond item Q72056756), as it has a memory leak (and would OOM in a few days) [production]
01:53 <mutante> miscweb1002 / miscweb2002 - both backends 'PASS: 26 requests sent to miscweb1002.eqiad.wmnet. All assertions passed.' again after fixing httpbb tests and T297605 [production]
01:50 <mutante> miscweb1002 - re-enabling puppet after deployment for T297605 [production]
01:03 <legoktm> removing current dump from static-codereview to replace it with a new one [production]
00:37 <mwdebug-deploy@deploy1002> helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . [production]
00:30 <mwdebug-deploy@deploy1002> helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . [production]
00:22 <legoktm> upgraded php7.2 on mw1414 for mysqlnd memory leak fix part 2 (T297667) [production]
00:19 <legoktm> uploaded 7.2.34-18+0~20210223.60+debian10~1.gbpb21322+wmf5 to buster-wikimedia for T297667 [production]
00:18 <ejegg> updated payments-wiki from df3ded67 to 55e605dd [production]
00:16 <mutante> miscweb1002 - disable puppet, deploying gerrit:747600 on miscweb2002 first, indeed puppet problem detected T297605 [production]
00:05 <legoktm> published new versions of php7.{2,4}-fpm-multiversion-base image with php-yaml extension (T296331) [production]
2021-12-15
23:38 <milimetric@deploy1002> Finished deploy [analytics/refinery@0d74de0] (thin): Pushing 0.1.23 for SparkSQLNCLIDriver job (THIN) (duration: 00m 07s) [production]
23:37 <milimetric@deploy1002> Started deploy [analytics/refinery@0d74de0] (thin): Pushing 0.1.23 for SparkSQLNCLIDriver job (THIN) [production]
23:26 <milimetric@deploy1002> Finished deploy [analytics/refinery@0d74de0]: Pushing 0.1.23 for SparkSQLNCLIDriver job (duration: 15m 35s) [production]
23:10 <milimetric@deploy1002> Started deploy [analytics/refinery@0d74de0]: Pushing 0.1.23 for SparkSQLNCLIDriver job [production]
22:50 <legoktm> installing php-yaml on parsoid, jobrunners and maint servers [production]
20:52 <mwdebug-deploy@deploy1002> helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . [production]
20:52 <hashar@deploy1002> Synchronized php-1.38.0-wmf.13/includes/skins/Skin.php: Remove migration script - T297484 (duration: 01m 06s) [production]
20:51 <mwdebug-deploy@deploy1002> helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . [production]
20:46 <hashar@deploy1002> rebuilt and synchronized wikiversions files: Revert "group1 wikis to 1.38.0-wmf.13 refs T293954 [production]
20:45 <mwdebug-deploy@deploy1002> helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . [production]
20:44 <mwdebug-deploy@deploy1002> helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . [production]
20:12 <mwdebug-deploy@deploy1002> helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . [production]
20:11 <mwdebug-deploy@deploy1002> helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . [production]
20:05 <mwdebug-deploy@deploy1002> helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . [production]
20:04 <mwdebug-deploy@deploy1002> helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . [production]
20:04 <hashar@deploy1002> Synchronized php: group1 wikis to 1.38.0-wmf.13 refs T293954 (duration: 01m 05s) [production]
20:03 <hashar@deploy1002> rebuilt and synchronized wikiversions files: group1 wikis to 1.38.0-wmf.13 refs T293954 [production]
19:52 <pt1979@cumin1001> END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host an-test-coord1002.eqiad.wmnet with OS buster [production]
19:45 <pt1979@cumin1001> START - Cookbook sre.hosts.reimage for host an-test-coord1002.eqiad.wmnet with OS buster [production]
19:30 <mwdebug-deploy@deploy1002> helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . [production]
19:29 <mwdebug-deploy@deploy1002> helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . [production]
19:27 <hashar> UTC evening backport window completed [production]
19:25 <hashar@deploy1002> Synchronized wmf-config/InitialiseSettings.php: Enable tegola on enwiki T2980767 (duration: 01m 06s) [production]
19:22 <mwdebug-deploy@deploy1002> helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . [production]
19:22 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on ganeti2024.codfw.wmnet with reason: Temporarily remove node from Ganeti for reimage [production]
19:22 <jmm@cumin2002> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on ganeti2024.codfw.wmnet with reason: Temporarily remove node from Ganeti for reimage [production]