2022-01-13
10:27 <marostegui@cumin1001> dbctl commit (dc=all): 'es1022 (re)pooling @ 1%: repooling after reimage', diff saved to https://phabricator.wikimedia.org/P18719 and previous config saved to /var/cache/conftool/dbconfig/20220113-102734-root.json [production]
10:27 <moritzm> systemctl reset-failed ifup@ens5.service on lists1001 T273026 [production]
10:13 <jmm@cumin2002> END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM grafana1002.eqiad.wmnet [production]
10:10 <moritzm> rebooting grafana1002 (running grafana.wikimedia.org) [production]
10:10 <jmm@cumin2002> START - Cookbook sre.ganeti.reboot-vm for VM grafana1002.eqiad.wmnet [production]
10:09 <marostegui@cumin1001> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es1022.eqiad.wmnet with OS bullseye [production]
10:02 <mmandere> cp3052: upgrade varnish to 6.0.9-1wm1 T298758 [production]
10:02 <joal@deploy1002> Finished deploy [analytics/refinery@94ec386]: Hotfix analytics deploy [analytics/refinery@94ec386] (duration: 21m 47s) [production]
10:02 <elukey> run kafka preferred-replica-election on kafka-main1001 to force a rebalance of partition leaders (after kafka-main1002's reimage) [production]
10:00 <btullis@cumin1001> END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM kafka-test1006.eqiad.wmnet [production]
09:59 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main1002.eqiad.wmnet with OS buster [production]
09:56 <btullis@cumin1001> START - Cookbook sre.ganeti.reboot-vm for VM kafka-test1006.eqiad.wmnet [production]
09:49 <marostegui@cumin1001> START - Cookbook sre.hosts.reimage for host es1022.eqiad.wmnet with OS bullseye [production]
09:46 <marostegui@cumin1001> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es1022.eqiad.wmnet with OS bullseye [production]
09:42 <marostegui@cumin1001> START - Cookbook sre.hosts.reimage for host es1022.eqiad.wmnet with OS bullseye [production]
09:40 <joal@deploy1002> Started deploy [analytics/refinery@94ec386]: Hotfix analytics deploy [analytics/refinery@94ec386] [production]
09:40 <joal@deploy1002> Finished deploy [analytics/refinery@94ec386] (thin): Hotfix analytics deploy THIN [analytics/refinery@94ec386] (duration: 00m 07s) [production]
09:40 <joal@deploy1002> Started deploy [analytics/refinery@94ec386] (thin): Hotfix analytics deploy THIN [analytics/refinery@94ec386] [production]
09:39 <joal@deploy1002> Finished deploy [analytics/refinery@94ec386] (hadoop-test): Hotfix analytics deploy TEST [analytics/refinery@94ec386] (duration: 06m 59s) [production]
09:35 <marostegui@cumin1001> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es1022.eqiad.wmnet with OS bullseye [production]
09:32 <joal@deploy1002> Started deploy [analytics/refinery@94ec386] (hadoop-test): Hotfix analytics deploy TEST [analytics/refinery@94ec386] [production]
09:30 <marostegui@cumin1001> START - Cookbook sre.hosts.reimage for host es1022.eqiad.wmnet with OS bullseye [production]
09:30 <marostegui@cumin1001> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es1022.eqiad.wmnet with OS bullseye [production]
09:26 <elukey@cumin1001> START - Cookbook sre.hosts.reimage for host kafka-main1002.eqiad.wmnet with OS buster [production]
09:25 <marostegui@cumin1001> START - Cookbook sre.hosts.reimage for host es1022.eqiad.wmnet with OS bullseye [production]
09:24 <marostegui@cumin1001> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host es1022.eqiad.wmnet with OS bullseye [production]
09:16 <jmm@cumin2002> END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM xhgui1001.eqiad.wmnet [production]
09:14 <jmm@cumin2002> START - Cookbook sre.ganeti.reboot-vm for VM xhgui1001.eqiad.wmnet [production]
09:08 <marostegui@cumin1001> START - Cookbook sre.hosts.reimage for host es1022.eqiad.wmnet with OS bullseye [production]
09:03 <jmm@cumin2002> END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM lists1001.wikimedia.org [production]
09:02 <moritzm> rebooting lists1001 (running lists.wikimedia.org) to pick up new KVM setting [production]
09:00 <jmm@cumin2002> START - Cookbook sre.ganeti.reboot-vm for VM lists1001.wikimedia.org [production]
08:59 <marostegui@cumin1001> dbctl commit (dc=all): 'Depool es1022, give weight to es1021 T295965 ', diff saved to https://phabricator.wikimedia.org/P18718 and previous config saved to /var/cache/conftool/dbconfig/20220113-085906-marostegui.json [production]
08:42 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-main1003.eqiad.wmnet with OS buster [production]
08:39 <elukey> ipmi mc reset cold for kafka-main1002, mgmt interface not reachable via ssh [production]
08:39 <marostegui@cumin1001> dbctl commit (dc=all): 'Remove recentchanges group from s7 eqiad T263127', diff saved to https://phabricator.wikimedia.org/P18717 and previous config saved to /var/cache/conftool/dbconfig/20220113-083923-marostegui.json [production]
08:28 <ladsgroup@deploy1002> Synchronized php-1.38.0-wmf.16/extensions/SpamBlacklist/includes/SpamBlacklistHooks.php: Backport: [[gerrit:753505|Take LogicException into consideration (T299111)]] (duration: 01m 28s) [production]
08:28 <mwdebug-deploy@deploy1002> helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn [production]
08:27 <mwdebug-deploy@deploy1002> helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn [production]
08:27 <mwdebug-deploy@deploy1002> helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn [production]
08:23 <mwdebug-deploy@deploy1002> helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn [production]
08:21 <ladsgroup@deploy1002> Synchronized php-1.38.0-wmf.17/extensions/SpamBlacklist/includes/SpamBlacklistHooks.php: Backport: [[gerrit:753504|Take LogicException into consideration (T299111)]] (duration: 01m 28s) [production]
08:13 <mwdebug-deploy@deploy1002> helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn [production]
08:09 <mwdebug-deploy@deploy1002> helmfile [codfw] START helmfile.d/services/mwdebug: apply on pinkunicorn [production]
08:09 <mwdebug-deploy@deploy1002> helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync on pinkunicorn [production]
08:08 <mwdebug-deploy@deploy1002> helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn [production]
08:08 <elukey@cumin1001> START - Cookbook sre.hosts.reimage for host kafka-main1003.eqiad.wmnet with OS buster [production]
08:06 <marostegui> Change innodb_checksum_algorithm=full_crc32 on eqiad sanitarium hosts (db1154, db1155) T287244 [production]
08:02 <elukey> ipmi mc reset cold for kafka-main1003, mgmt interface not reachable via ssh [production]
07:57 <elukey> stop kafka* on kafka-main1003 as prep-step for reimage to buster [production]