2022-03-23
14:20 <bking@cumin1001> START - Cookbook sre.wdqs.reboot [production]
14:19 <bking@cumin1001> conftool action : set/pooled=yes; selector: name=wcqs1002.eqiad.wmnet [production]
14:18 <andrew@cumin1001> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1029.eqiad.wmnet with OS bullseye [production]
14:11 <bking@cumin1001> END (FAIL) - Cookbook sre.wdqs.reboot (exit_code=99) [production]
14:10 <andrew@cumin1001> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1027.eqiad.wmnet with OS bullseye [production]
14:09 <mwdebug-deploy@deploy1002> helmfile [codfw] DONE helmfile.d/services/mwdebug: apply [production]
14:08 <mwdebug-deploy@deploy1002> helmfile [codfw] START helmfile.d/services/mwdebug: apply [production]
14:08 <mwdebug-deploy@deploy1002> helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply [production]
14:06 <mmandere> pool cp1082 with HAProxy as TLS termination layer - T290005 [production]
14:04 <bking@cumin1001> START - Cookbook sre.wdqs.reboot [production]
14:04 <mwdebug-deploy@deploy1002> helmfile [eqiad] START helmfile.d/services/mwdebug: apply [production]
14:04 <bking@cumin1001> END (FAIL) - Cookbook sre.wdqs.reboot (exit_code=99) [production]
14:04 <bking@cumin1001> START - Cookbook sre.wdqs.reboot [production]
14:04 <bking@cumin1001> END (FAIL) - Cookbook sre.wdqs.reboot (exit_code=99) [production]
14:04 <bking@cumin1001> START - Cookbook sre.wdqs.reboot [production]
14:00 <mmandere@cumin1001> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp1082.eqiad.wmnet with OS buster [production]
14:00 <bking@cumin1001> END (PASS) - Cookbook sre.wdqs.reboot (exit_code=0) [production]
13:59 <andrew@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1029.eqiad.wmnet with reason: host reimage [production]
13:59 <mwdebug-deploy@deploy1002> helmfile [codfw] DONE helmfile.d/services/mwdebug: apply [production]
13:58 <mwdebug-deploy@deploy1002> helmfile [codfw] START helmfile.d/services/mwdebug: apply [production]
13:58 <mwdebug-deploy@deploy1002> helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply [production]
13:57 <bking@cumin1001> START - Cookbook sre.wdqs.reboot [production]
13:55 <andrew@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1029.eqiad.wmnet with reason: host reimage [production]
13:54 <mwdebug-deploy@deploy1002> helmfile [eqiad] START helmfile.d/services/mwdebug: apply [production]
13:51 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kubernetes1010.eqiad.wmnet with OS bullseye [production]
13:50 <andrew@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1027.eqiad.wmnet with reason: host reimage [production]
13:49 <mwdebug-deploy@deploy1002> helmfile [codfw] DONE helmfile.d/services/mwdebug: apply [production]
13:48 <Lucas_WMDE> UTC afternoon backport window done [production]
13:48 <mwdebug-deploy@deploy1002> helmfile [codfw] START helmfile.d/services/mwdebug: apply [production]
13:48 <mwdebug-deploy@deploy1002> helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply [production]
13:47 <lucaswerkmeister-wmde@deploy1002> Synchronized wmf-config/InitialiseSettings-labs.php: Config: [[gerrit:773209|Enable Wikibase REST API on beta wikidata (T302959)]] (2/2, production no-op) (duration: 01m 05s) [production]
13:46 <lucaswerkmeister-wmde@deploy1002> Synchronized wmf-config/Wikibase.php: Config: [[gerrit:773209|Enable Wikibase REST API on beta wikidata (T302959)]] (1/2, production no-op) (duration: 01m 07s) [production]
13:46 <andrew@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1027.eqiad.wmnet with reason: host reimage [production]
13:45 <andrew@cumin1001> START - Cookbook sre.hosts.reimage for host cloudvirt1029.eqiad.wmnet with OS bullseye [production]
13:43 <mwdebug-deploy@deploy1002> helmfile [eqiad] START helmfile.d/services/mwdebug: apply [production]
13:41 <marostegui@cumin1001> dbctl commit (dc=all): 'Depooling db1121 (T300775)', diff saved to https://phabricator.wikimedia.org/P23010 and previous config saved to /var/cache/conftool/dbconfig/20220323-134153-marostegui.json [production]
13:41 <marostegui@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6 days, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance [production]
13:41 <marostegui@cumin1001> START - Cookbook sre.hosts.downtime for 6 days, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance [production]
13:41 <marostegui@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1121.eqiad.wmnet with reason: Maintenance [production]
13:41 <marostegui@cumin1001> START - Cookbook sre.hosts.downtime for 3 days, 0:00:00 on db1121.eqiad.wmnet with reason: Maintenance [production]
13:41 <marostegui@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1141 (T300775)', diff saved to https://phabricator.wikimedia.org/P23009 and previous config saved to /var/cache/conftool/dbconfig/20220323-134140-marostegui.json [production]
13:39 <lucaswerkmeister-wmde@deploy1002> Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:768090|Write "unexpectedUnconnectedPage" page prop on Test Wikidata clients]] (duration: 01m 10s) [production]
13:39 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubernetes1010.eqiad.wmnet with reason: host reimage [production]
13:38 <moritzm> restarting superset for OpenSSL update [production]
13:36 <mmandere@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp1082.eqiad.wmnet with reason: host reimage [production]
13:35 <andrew@cumin1001> START - Cookbook sre.hosts.reimage for host cloudvirt1027.eqiad.wmnet with OS bullseye [production]
13:34 <elukey@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on kubernetes1010.eqiad.wmnet with reason: host reimage [production]
13:33 <mmandere@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on cp1082.eqiad.wmnet with reason: host reimage [production]
13:26 <marostegui@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1141', diff saved to https://phabricator.wikimedia.org/P23008 and previous config saved to /var/cache/conftool/dbconfig/20220323-132635-marostegui.json [production]
13:19 <elukey@cumin1001> START - Cookbook sre.hosts.reimage for host kubernetes1010.eqiad.wmnet with OS bullseye [production]