2023-02-27
15:52 <pt1979@cumin2002> START - Cookbook sre.hosts.reimage for host wdqs2022.codfw.wmnet with OS bullseye [production]
15:52 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on ml-etcd2001.codfw.wmnet with reason: etcd cluster upgrade failed, waiting for k8s upgrade [production]
15:52 <elukey@cumin1001> START - Cookbook sre.hosts.downtime for 4 days, 0:00:00 on ml-etcd2001.codfw.wmnet with reason: etcd cluster upgrade failed, waiting for k8s upgrade [production]
15:48 <bking@cumin1001> END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) [production]
15:48 <root@cumin1001> END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['cloudcephosd1004'] [production]
15:44 <elukey@cumin1001> START - Cookbook sre.hosts.reimage for host dse-k8s-worker1008.eqiad.wmnet with OS bullseye [production]
15:43 <elukey@cumin1001> START - Cookbook sre.hosts.reimage for host dse-k8s-worker1005.eqiad.wmnet with OS bullseye [production]
15:41 <pt1979@cumin2002> START - Cookbook sre.hosts.reimage for host wdqs2015.codfw.wmnet with OS bullseye [production]
15:41 <root@cumin1001> START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['cloudcephosd1004'] [production]
15:41 <dcaro@cumin1001> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudcephosd1003.eqiad.wmnet with OS bullseye [production]
15:41 <dcaro@cumin1001> END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - dcaro@cumin1001" [production]
15:40 <urbanecm@deploy1002> Finished scap: Backport for [[gerrit:891503|cswiki: Grant changetags only to bots/sysops (T330383)]] (duration: 07m 39s) [production]
15:36 <dcaro@cumin1001> START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - dcaro@cumin1001" [production]
15:35 <elukey@cumin1001> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=1) for host dse-k8s-worker1007.eqiad.wmnet with OS bullseye [production]
15:34 <urbanecm@deploy1002> urbanecm: Backport for [[gerrit:891503|cswiki: Grant changetags only to bots/sysops (T330383)]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet [production]
15:33 <marostegui@cumin1001> dbctl commit (dc=all): 'db2180 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P44888 and previous config saved to /var/cache/conftool/dbconfig/20230227-153324-root.json [production]
15:33 <marostegui@cumin1001> dbctl commit (dc=all): 'db2146 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P44887 and previous config saved to /var/cache/conftool/dbconfig/20230227-153318-root.json [production]
15:33 <marostegui@cumin1001> dbctl commit (dc=all): 'db2178 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P44886 and previous config saved to /var/cache/conftool/dbconfig/20230227-153313-root.json [production]
15:32 <urbanecm@deploy1002> Started scap: Backport for [[gerrit:891503|cswiki: Grant changetags only to bots/sysops (T330383)]] [production]
15:24 <cgoubert@deploy1002> helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'. [production]
15:24 <cgoubert@deploy1002> helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'. [production]
15:21 <dcaro@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudcephosd1003.eqiad.wmnet with reason: host reimage [production]
15:18 <marostegui@cumin1001> dbctl commit (dc=all): 'db1127 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P44884 and previous config saved to /var/cache/conftool/dbconfig/20230227-151836-root.json [production]
15:18 <dcaro@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcephosd1003.eqiad.wmnet with reason: host reimage [production]
15:18 <marostegui@cumin1001> dbctl commit (dc=all): 'db1168 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P44883 and previous config saved to /var/cache/conftool/dbconfig/20230227-151826-root.json [production]
15:18 <marostegui@cumin1001> dbctl commit (dc=all): 'db2180 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P44882 and previous config saved to /var/cache/conftool/dbconfig/20230227-151819-root.json [production]
15:18 <marostegui@cumin1001> dbctl commit (dc=all): 'db2146 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P44881 and previous config saved to /var/cache/conftool/dbconfig/20230227-151813-root.json [production]
15:18 <marostegui@cumin1001> dbctl commit (dc=all): 'db2178 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P44880 and previous config saved to /var/cache/conftool/dbconfig/20230227-151808-root.json [production]
15:14 <marostegui@cumin1001> dbctl commit (dc=all): 'db2122 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P44878 and previous config saved to /var/cache/conftool/dbconfig/20230227-151434-root.json [production]
15:13 <bking@deploy1002> helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'. [production]
15:13 <bking@deploy1002> helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'. [production]
15:12 <bking@deploy1002> helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'. [production]
15:12 <bking@deploy1002> helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'. [production]
15:11 <inflatador> bking@deploy1002 applying https://gerrit.wikimedia.org/r/c/operations/deployment-charts/+/891577 on dse-k8s-cluster via helmfile [production]
15:11 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1007.eqiad.wmnet with reason: host reimage [production]
15:08 <elukey@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1007.eqiad.wmnet with reason: host reimage [production]
15:06 <dcaro@cumin1001> START - Cookbook sre.hosts.reimage for host cloudcephosd1003.eqiad.wmnet with OS bullseye [production]
15:05 <marostegui@cumin1001> dbctl commit (dc=all): 'db2175 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P44877 and previous config saved to /var/cache/conftool/dbconfig/20230227-150535-root.json [production]
15:04 <robh@cumin1001> END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dumpsdata1006.eqiad.wmnet with OS bullseye [production]
15:03 <marostegui@cumin1001> dbctl commit (dc=all): 'db1127 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P44876 and previous config saved to /var/cache/conftool/dbconfig/20230227-150331-root.json [production]
15:03 <marostegui@cumin1001> dbctl commit (dc=all): 'db1168 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P44875 and previous config saved to /var/cache/conftool/dbconfig/20230227-150322-root.json [production]
15:03 <marostegui@cumin1001> dbctl commit (dc=all): 'db2180 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P44874 and previous config saved to /var/cache/conftool/dbconfig/20230227-150315-root.json [production]
15:03 <marostegui@cumin1001> dbctl commit (dc=all): 'db2146 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P44873 and previous config saved to /var/cache/conftool/dbconfig/20230227-150309-root.json [production]
15:03 <marostegui@cumin1001> dbctl commit (dc=all): 'db2178 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P44872 and previous config saved to /var/cache/conftool/dbconfig/20230227-150304-root.json [production]
15:01 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-etcd2001.codfw.wmnet with reason: host reimage [production]
14:59 <marostegui@cumin1001> dbctl commit (dc=all): 'db2122 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P44871 and previous config saved to /var/cache/conftool/dbconfig/20230227-145929-root.json [production]
14:56 <elukey@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on ml-etcd2001.codfw.wmnet with reason: host reimage [production]
14:54 <elukey@cumin1001> START - Cookbook sre.hosts.reimage for host dse-k8s-worker1007.eqiad.wmnet with OS bullseye [production]
14:52 <robh@cumin1001> START - Cookbook sre.hosts.reimage for host dumpsdata1006.eqiad.wmnet with OS bullseye [production]
14:52 <robh@cumin1001> END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dumpsdata1006.eqiad.wmnet with OS bullseye [production]