2022-01-13
09:49 <marostegui@cumin1001> START - Cookbook sre.hosts.reimage for host es1022.eqiad.wmnet with OS bullseye [production]
09:46 <marostegui@cumin1001> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es1022.eqiad.wmnet with OS bullseye [production]
09:42 <marostegui@cumin1001> START - Cookbook sre.hosts.reimage for host es1022.eqiad.wmnet with OS bullseye [production]
09:40 <joal@deploy1002> Started deploy [analytics/refinery@94ec386]: Hotfix analytics deploy [analytics/refinery@94ec386] [production]
09:40 <joal@deploy1002> Finished deploy [analytics/refinery@94ec386] (thin): Hotfix analytics deploy THIN [analytics/refinery@94ec386] (duration: 00m 07s) [production]
09:40 <joal@deploy1002> Started deploy [analytics/refinery@94ec386] (thin): Hotfix analytics deploy THIN [analytics/refinery@94ec386] [production]
09:39 <joal@deploy1002> Finished deploy [analytics/refinery@94ec386] (hadoop-test): Hotfix analytics deploy TEST [analytics/refinery@94ec386] (duration: 06m 59s) [production]
09:35 <marostegui@cumin1001> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es1022.eqiad.wmnet with OS bullseye [production]
09:32 <joal@deploy1002> Started deploy [analytics/refinery@94ec386] (hadoop-test): Hotfix analytics deploy TEST [analytics/refinery@94ec386] [production]
09:30 <marostegui@cumin1001> START - Cookbook sre.hosts.reimage for host es1022.eqiad.wmnet with OS bullseye [production]
09:30 <marostegui@cumin1001> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es1022.eqiad.wmnet with OS bullseye [production]
09:26 <elukey@cumin1001> START - Cookbook sre.hosts.reimage for host kafka-main1002.eqiad.wmnet with OS buster [production]
09:25 <marostegui@cumin1001> START - Cookbook sre.hosts.reimage for host es1022.eqiad.wmnet with OS bullseye [production]
09:24 <marostegui@cumin1001> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es1022.eqiad.wmnet with OS bullseye [production]
09:16 <jmm@cumin2002> END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM xhgui1001.eqiad.wmnet [production]
09:14 <jmm@cumin2002> START - Cookbook sre.ganeti.reboot-vm for VM xhgui1001.eqiad.wmnet [production]
09:08 <marostegui@cumin1001> START - Cookbook sre.hosts.reimage for host es1022.eqiad.wmnet with OS bullseye [production]
09:03 <jmm@cumin2002> END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM lists1001.wikimedia.org [production]
09:02 <moritzm> rebooting lists1001 (running lists.wikimedia.org) to pick up new KVM setting [production]
09:00 <jmm@cumin2002> START - Cookbook sre.ganeti.reboot-vm for VM lists1001.wikimedia.org [production]
08:59 <marostegui@cumin1001> dbctl commit (dc=all): 'Depool es1022, give weight to es1021 T295965 ', diff saved to https://phabricator.wikimedia.org/P18718 and previous config saved to /var/cache/conftool/dbconfig/20220113-085906-marostegui.json [production]
08:42 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main1003.eqiad.wmnet with OS buster [production]
08:39 <elukey> ipmi mc reset cold for kafka-main1002, mgmt interface not reachable via ssh [production]
08:39 <marostegui@cumin1001> dbctl commit (dc=all): 'Remove recentchanges group from s7 eqiad T263127', diff saved to https://phabricator.wikimedia.org/P18717 and previous config saved to /var/cache/conftool/dbconfig/20220113-083923-marostegui.json [production]
08:28 <ladsgroup@deploy1002> Synchronized php-1.38.0-wmf.16/extensions/SpamBlacklist/includes/SpamBlacklistHooks.php: Backport: [[gerrit:753505|Take LogicException into consideration (T299111)]] (duration: 01m 28s) [production]
08:28 <mwdebug-deploy@deploy1002> helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn [production]
08:27 <mwdebug-deploy@deploy1002> helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn [production]
08:27 <mwdebug-deploy@deploy1002> helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn [production]
08:23 <mwdebug-deploy@deploy1002> helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn [production]
08:21 <ladsgroup@deploy1002> Synchronized php-1.38.0-wmf.17/extensions/SpamBlacklist/includes/SpamBlacklistHooks.php: Backport: [[gerrit:753504|Take LogicException into consideration (T299111)]] (duration: 01m 28s) [production]
08:13 <mwdebug-deploy@deploy1002> helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn [production]
08:09 <mwdebug-deploy@deploy1002> helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn [production]
08:09 <mwdebug-deploy@deploy1002> helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn [production]
08:08 <mwdebug-deploy@deploy1002> helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn [production]
08:08 <elukey@cumin1001> START - Cookbook sre.hosts.reimage for host kafka-main1003.eqiad.wmnet with OS buster [production]
08:06 <marostegui> Change innodb_checksum_algorithm=full_crc32 on eqiad sanitarium hosts (db1154, db1155) T287244 [production]
08:02 <elukey> ipmi mc reset cold for kafka-main1003, mgmt interface not reachable via ssh [production]
07:57 <elukey> stop kafka* on kafka-main1003 as prep-step for reimage to buster [production]
07:50 <marostegui@cumin1001> dbctl commit (dc=all): 'Remove recentchangeslinked group from s7 eqiad T263127', diff saved to https://phabricator.wikimedia.org/P18715 and previous config saved to /var/cache/conftool/dbconfig/20220113-075012-marostegui.json [production]
07:32 <marostegui@cumin1001> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dbproxy1015.eqiad.wmnet with OS bullseye [production]
07:03 <marostegui@cumin1001> START - Cookbook sre.hosts.reimage for host dbproxy1015.eqiad.wmnet with OS bullseye [production]
06:42 <mwdebug-deploy@deploy1002> helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn [production]
06:41 <ladsgroup@deploy1002> Synchronized php-1.38.0-wmf.16/includes/export/WikiExporter.php: Backport: [[gerrit:753501|export: Remove ignoring rev_page_id index (T163532)]] (duration: 01m 28s) [production]
06:41 <marostegui@cumin1001> dbctl commit (dc=all): 'db1169 (re)pooling @ 100%: repooling after maintenance and reimage', diff saved to https://phabricator.wikimedia.org/P18714 and previous config saved to /var/cache/conftool/dbconfig/20220113-064113-root.json [production]
06:39 <mwdebug-deploy@deploy1002> helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn [production]
06:39 <mwdebug-deploy@deploy1002> helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn [production]
06:38 <marostegui> Failover m3 proxy from dbproxy1016 to dbproxy1020 T298586 [production]
06:38 <mwdebug-deploy@deploy1002> helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn [production]
06:26 <marostegui> Remove rev_page_id from frwiki,jawiki,ruwiki and labswiki from db1096 (s6) T285149 [production]
06:26 <marostegui@cumin1001> dbctl commit (dc=all): 'db1169 (re)pooling @ 75%: repooling after maintenance and reimage', diff saved to https://phabricator.wikimedia.org/P18713 and previous config saved to /var/cache/conftool/dbconfig/20220113-062609-root.json [production]