2022-09-06
16:57 <mwdebug-deploy@deploy1002> helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply [production]
16:56 <mwdebug-deploy@deploy1002> helmfile [eqiad] START helmfile.d/services/mwdebug: apply [production]
16:55 <pt1979@cumin2002> START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-logging1004'] [production]
16:51 <mwdebug-deploy@deploy1002> helmfile [codfw] DONE helmfile.d/services/mwdebug: apply [production]
16:50 <mwdebug-deploy@deploy1002> helmfile [codfw] START helmfile.d/services/mwdebug: apply [production]
16:50 <mwdebug-deploy@deploy1002> helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply [production]
16:50 <mwdebug-deploy@deploy1002> helmfile [eqiad] START helmfile.d/services/mwdebug: apply [production]
16:47 <jelto@cumin1001> END (PASS) - Cookbook sre.gitlab.reboot-runner (exit_code=0) rolling reboot on A:gitlab-runner [production]
16:45 <pt1979@cumin2002> END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['kafka-logging1004'] [production]
16:44 <pt1979@cumin2002> START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-logging1004'] [production]
16:44 <pt1979@cumin2002> END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['kafka-logging1004'] [production]
16:42 <pt1979@cumin2002> START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-logging1004'] [production]
16:36 <pt1979@cumin2002> END (ERROR) - Cookbook sre.hardware.upgrade-firmware (exit_code=97) upgrade firmware for hosts ['kafka-logging1004'] [production]
16:25 <btullis@deploy1002> helmfile [eqiad] DONE helmfile.d/services/datahub: sync on main [production]
16:24 <btullis@deploy1002> helmfile [eqiad] START helmfile.d/services/datahub: apply on main [production]
16:23 <btullis@deploy1002> helmfile [codfw] DONE helmfile.d/services/datahub: sync on main [production]
16:22 <btullis@deploy1002> helmfile [codfw] START helmfile.d/services/datahub: apply on main [production]
16:22 <btullis@deploy1002> helmfile [staging] DONE helmfile.d/services/datahub: sync on main [production]
16:20 <btullis@deploy1002> helmfile [staging] START helmfile.d/services/datahub: apply on main [production]
16:18 <pt1979@cumin2002> START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-logging1004'] [production]
16:12 <ayounsi@cumin1001> END (PASS) - Cookbook sre.network.prepare-upgrade (exit_code=0) [production]
16:12 <pt1979@cumin2002> END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1004.mgmt.eqiad.wmnet with reboot policy FORCED [production]
16:01 <jelto@cumin1001> START - Cookbook sre.gitlab.reboot-runner rolling reboot on A:gitlab-runner [production]
15:50 <marostegui@cumin1001> dbctl commit (dc=all): 'db1180 (re)pooling @ 100%: Repooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P33968 and previous config saved to /var/cache/conftool/dbconfig/20220906-154959-root.json [production]
15:48 <ayounsi@cumin1001> START - Cookbook sre.network.prepare-upgrade [production]
15:44 <root@cumin1001> END (FAIL) - Cookbook sre.network.prepare-upgrade (exit_code=99) [production]
15:43 <root@cumin1001> START - Cookbook sre.network.prepare-upgrade [production]
15:43 <root@cumin1001> END (FAIL) - Cookbook sre.network.prepare-upgrade (exit_code=99) [production]
15:43 <root@cumin1001> START - Cookbook sre.network.prepare-upgrade [production]
15:34 <marostegui@cumin1001> dbctl commit (dc=all): 'db1180 (re)pooling @ 75%: Repooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P33967 and previous config saved to /var/cache/conftool/dbconfig/20220906-153454-root.json [production]
15:21 <jelto@cumin1001> END (FAIL) - Cookbook sre.gitlab.reboot-runner (exit_code=1) rolling reboot on A:gitlab-runner [production]
15:20 <jelto@cumin1001> START - Cookbook sre.gitlab.reboot-runner rolling reboot on A:gitlab-runner [production]
15:19 <marostegui@cumin1001> dbctl commit (dc=all): 'db1180 (re)pooling @ 50%: Repooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P33966 and previous config saved to /var/cache/conftool/dbconfig/20220906-151950-root.json [production]
15:15 <claime> Set wtp10[41-43].eqiad.wmnet inactive pending decommission T317025 [production]
15:14 <cgoubert@puppetmaster1001> conftool action : set/pooled=inactive; selector: dc=eqiad,cluster=parsoid,name=wtp1043.eqiad.wmnet [production]
15:14 <cgoubert@puppetmaster1001> conftool action : set/pooled=inactive; selector: dc=eqiad,cluster=parsoid,name=wtp1042.eqiad.wmnet [production]
15:14 <cgoubert@puppetmaster1001> conftool action : set/pooled=inactive; selector: dc=eqiad,cluster=parsoid,name=wtp1041.eqiad.wmnet [production]
15:12 <cgoubert@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on wtp[1041-1043].eqiad.wmnet with reason: Downtiming replaced wtp servers [production]
15:12 <cgoubert@cumin1001> START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on wtp[1041-1043].eqiad.wmnet with reason: Downtiming replaced wtp servers [production]
15:09 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Depooling db2158 (T314041)', diff saved to https://phabricator.wikimedia.org/P33965 and previous config saved to /var/cache/conftool/dbconfig/20220906-150953-ladsgroup.json [production]
15:09 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2095.codfw.wmnet with reason: Maintenance [production]
15:09 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2095.codfw.wmnet with reason: Maintenance [production]
15:09 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2158.codfw.wmnet with reason: Maintenance [production]
15:09 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2158.codfw.wmnet with reason: Maintenance [production]
15:09 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db2169:3316 (T314041)', diff saved to https://phabricator.wikimedia.org/P33964 and previous config saved to /var/cache/conftool/dbconfig/20220906-150928-ladsgroup.json [production]
15:08 <claime> depooled wtp1045.eqiad.wmnet from parsoid cluster T307219 [production]
15:04 <marostegui@cumin1001> dbctl commit (dc=all): 'db1180 (re)pooling @ 25%: Repooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P33963 and previous config saved to /var/cache/conftool/dbconfig/20220906-150445-root.json [production]
14:58 <claime> pooled parse1012.eqiad.wmnet (php 7.4 only) in parsoid cluster T307219 [production]
14:55 <cgoubert@cumin1001> END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for parse1012.eqiad.wmnet [production]
14:55 <cgoubert@cumin1001> START - Cookbook sre.hosts.remove-downtime for parse1012.eqiad.wmnet [production]