2021-12-16
15:02 <pt1979@cumin2002> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
15:01 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idp-test1001.wikimedia.org [production]
14:58 <pt1979@cumin2002> START - Cookbook sre.dns.netbox [production]
14:58 <jmm@cumin2002> START - Cookbook sre.hosts.reboot-single for host idp-test1001.wikimedia.org [production]
14:58 <mwdebug-deploy@deploy1002> helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . [production]
14:57 <mwdebug-deploy@deploy1002> helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . [production]
14:55 <elukey> shutdown kafka-main2001 for BIOS+NIC firmware upgrades [production]
14:51 <mwdebug-deploy@deploy1002> helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . [production]
14:50 <mwdebug-deploy@deploy1002> helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . [production]
14:44 <moritzm> drain primary/secondary instances off ganeti2007 T296622 [production]
13:43 <mwdebug-deploy@deploy1002> helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . [production]
13:38 <mwdebug-deploy@deploy1002> helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . [production]
13:33 <volans> upgraded spicerack to v1.1.0 on cumin[1001,2001] [production]
13:27 <mbsantos@deploy1002> Finished deploy [kartotherian/deploy@e36c241] (eqiad): Change osm-intl and osm source to get MVT from Tegola (Full production for Tegola) (duration: 01m 39s) [production]
13:25 <mbsantos@deploy1002> Started deploy [kartotherian/deploy@e36c241] (eqiad): Change osm-intl and osm source to get MVT from Tegola (Full production for Tegola) [production]
13:24 <mbsantos@deploy1002> Finished deploy [kartotherian/deploy@e36c241] (codfw): (no justification provided) (duration: 03m 12s) [production]
13:21 <mbsantos@deploy1002> Started deploy [kartotherian/deploy@e36c241] (codfw): (no justification provided) [production]
13:18 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2024.codfw.wmnet [production]
13:17 <volans> uploaded spicerack_1.1.0 to apt.wikimedia.org buster-wikimedia,bullseye-wikimedia [production]
13:12 <jmm@cumin2002> START - Cookbook sre.hosts.reboot-single for host ganeti2024.codfw.wmnet [production]
12:45 <oblivian@deploy1002> helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . [production]
12:39 <oblivian@deploy1002> helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . [production]
12:34 <oblivian@deploy1002> helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . [production]
12:33 <oblivian@deploy1002> helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . [production]
12:14 <kharlan@deploy1002> Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:747677|Enable WelcomeSurvey Interaction schema (T267273 T297858)]] (duration: 01m 07s) [production]
12:14 <mwdebug-deploy@deploy1002> helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . [production]
12:13 <mwdebug-deploy@deploy1002> helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . [production]
12:11 <ayounsi@cumin1001> END (PASS) - Cookbook sre.network.cf (exit_code=0) [production]
12:11 <ayounsi@cumin1001> START - Cookbook sre.network.cf [production]
11:08 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main2003.codfw.wmnet with OS buster [production]
10:59 <btullis> pushed new packages for druid version 0.19.0-2 on buster using reprepro [production]
10:31 <elukey@cumin1001> START - Cookbook sre.hosts.reimage for host kafka-main2003.codfw.wmnet with OS buster [production]
10:28 <elukey> second attempt to reimage kafka-main2003 to buster [production]
10:09 <moritzm> drain primary/secondary instances off ganeti2007 T296622 [production]
10:04 <moritzm> switched kubetcd2004 to DRBD-based storage to allow migration for reimages [production]
09:50 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on kubetcd2004.codfw.wmnet with reason: switch to drbd storage [production]
09:50 <jmm@cumin2002> START - Cookbook sre.hosts.downtime for 1:00:00 on kubetcd2004.codfw.wmnet with reason: switch to drbd storage [production]
09:46 <moritzm> added ganeti2028 to ganeti codfw cluster T294139 [production]
09:38 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on ganeti2015.codfw.wmnet with reason: Temporarily remove node from Ganeti for reimage [production]
09:38 <jmm@cumin2002> START - Cookbook sre.hosts.downtime for 5 days, 0:00:00 on ganeti2015.codfw.wmnet with reason: Temporarily remove node from Ganeti for reimage [production]
09:11 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti2024.codfw.wmnet with OS buster [production]
08:55 <moritzm> drain primary/secondary instances off ganeti2015 T296622 [production]
08:43 <moritzm> switch ml-etcd2003 to DRBD-based storage to allow migration for reimages [production]
08:39 <mwdebug-deploy@deploy1002> helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . [production]
08:32 <mwdebug-deploy@deploy1002> helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . [production]
08:31 <urbanecm@deploy1002> Synchronized php-1.38.0-wmf.13/extensions/GrowthExperiments/: 35c055cead3d240625b76d21aa4e685525ca0d4b: MentorPageMentorManager: Do not fail hard with no mentor list configured (T297827) (duration: 01m 09s) [production]
08:20 <jmm@cumin2002> START - Cookbook sre.hosts.reimage for host ganeti2024.codfw.wmnet with OS buster [production]
08:07 <dcausse> restart blazegraph on wdqs1013 (jvm stuck for 4hours) [production]
03:27 <hoo> Stopped rebuildItemsPerSite on mwmaint1002 (was slightly beyond item Q72056756), as it has a memory leak (and would OOM in a few days) [production]
01:53 <mutante> miscweb1002 / miscweb2002 - both backends 'PASS: 26 requests sent to miscweb1002.eqiad.wmnet. All assertions passed.' again after fixing httpbb tests and T297605 [production]