2024-06-06
14:02 <kamila@cumin1002> START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-ctrl1001.eqiad.wmnet with reason: host reimage [production]
14:00 <kartik@deploy1002> Finished scap: Backport for [[gerrit:1039571|CX: Fix translation container max width for large screens (T366374)]] (duration: 13m 11s) [production]
13:57 <fabfur@cumin1002> START - Cookbook sre.hosts.reboot-single for host cp4050.ulsfo.wmnet [production]
13:56 <fabfur@cumin1002> conftool action : set/pooled=no; selector: name=cp4050.ulsfo.wmnet [production]
13:52 <kartik@deploy1002> kartik: Continuing with sync [production]
13:50 <kartik@deploy1002> kartik: Backport for [[gerrit:1039571|CX: Fix translation container max width for large screens (T366374)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug) [production]
13:47 <kamila@cumin1002> START - Cookbook sre.hosts.reimage for host wikikube-ctrl1001.eqiad.wmnet with OS bullseye [production]
13:47 <kartik@deploy1002> Started scap: Backport for [[gerrit:1039571|CX: Fix translation container max width for large screens (T366374)]] [production]
13:46 <samtar@deploy1002> Finished scap: Backport for [[gerrit:1039612|[mswiktionary] Change the default Sitename value to Wikikamus (T366549)]] (duration: 16m 05s) [production]
13:45 <kamila@cumin1002> END (PASS) - Cookbook sre.hosts.dhcp (exit_code=0) for host wikikube-ctrl1001.eqiad.wmnet [production]
13:44 <kamila@cumin1002> START - Cookbook sre.hosts.dhcp for host wikikube-ctrl1001.eqiad.wmnet [production]
13:44 <kamila@cumin1002> END (PASS) - Cookbook sre.hosts.dhcp (exit_code=0) for host wikikube-ctrl1001.eqiad.wmnet [production]
13:37 <samtar@deploy1002> samtar and gergesshamon: Continuing with sync [production]
13:32 <samtar@deploy1002> samtar and gergesshamon: Backport for [[gerrit:1039612|[mswiktionary] Change the default Sitename value to Wikikamus (T366549)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug) [production]
13:30 <samtar@deploy1002> Started scap: Backport for [[gerrit:1039612|[mswiktionary] Change the default Sitename value to Wikikamus (T366549)]] [production]
13:28 <samtar@deploy1002> Finished scap: Backport for [[gerrit:1038862|Activate campaignEvents extension on Igbo wiki. (T363199)]] (duration: 14m 07s) [production]
13:19 <samtar@deploy1002> mhorsey and samtar: Continuing with sync [production]
13:16 <samtar@deploy1002> mhorsey and samtar: Backport for [[gerrit:1038862|Activate campaignEvents extension on Igbo wiki. (T363199)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug) [production]
13:15 <samtar@deploy1002> Started scap: Backport for [[gerrit:1038862|Activate campaignEvents extension on Igbo wiki. (T363199)]] [production]
13:11 <taavi> taavi@deploy1002 ~ $ sudo kill 32174 # kill forgotten scap sync-world process [production]
13:08 <klausman@cumin1002> END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:ml-serve-worker-eqiad [production]
12:57 <vgutierrez> repool text@codfw with IPIP encapsulation enabled - T366466 [production]
12:56 <jiji@cumin1002> END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:wikikube-worker-eqiad [production]
12:56 <isaranto@deploy1002> helmfile [ml-staging-codfw] 'sync' command on namespace 'ores-legacy' for release 'main' . [production]
12:50 <vgutierrez> rolling restart of pybal on lvs2014 and lvs2011 - T366466 [production]
12:44 <topranks> disabling PyBal on lvs1019 to allow for cable move T366361 [production]
12:40 <fabfur@cumin1002> conftool action : set/pooled=yes; selector: name=cp4051.ulsfo.wmnet [production]
12:39 <topranks> rebooting ssw1-e1-eqiad to upgrade JunOS [production]
12:39 <fabfur@cumin1002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cp4051.ulsfo.wmnet [production]
12:33 <topranks> disabling BGP to ssw1-e1-eqiad from cr1-eqiad in advance of upgrade T366361 [production]
12:33 <vgutierrez> depool text@codfw before enabling IPIP encapsulation - T366466 [production]
12:29 <fabfur@cumin1002> START - Cookbook sre.hosts.reboot-single for host cp4051.ulsfo.wmnet [production]
12:28 <fabfur@cumin1002> conftool action : set/pooled=no; selector: name=cp4051.ulsfo.wmnet [production]
12:25 <topranks> disabling PyBal on lvs1018 to allow for cable move T366361 [production]
12:25 <cmooney@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:40:00 on lvs1018.eqiad.wmnet with reason: moving lvs1018 link to row E from spine to leaf [production]
12:25 <cmooney@cumin1002> START - Cookbook sre.hosts.downtime for 0:40:00 on lvs1018.eqiad.wmnet with reason: moving lvs1018 link to row E from spine to leaf [production]
12:24 <cmooney@cumin1002> END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs1017.eqiad.wmnet [production]
12:24 <cmooney@cumin1002> START - Cookbook sre.hosts.remove-downtime for lvs1017.eqiad.wmnet [production]
12:21 <sfaci@deploy1002> helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mpic-next: apply [production]
12:21 <sfaci@deploy1002> helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mpic-next: apply [production]
12:14 <cmooney@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:30:00 on 18 hosts with reason: upgrading spine switches eqiad rows e and f [production]
12:14 <cmooney@cumin1002> START - Cookbook sre.hosts.downtime for 1:30:00 on 18 hosts with reason: upgrading spine switches eqiad rows e and f [production]
11:56 <topranks> disabling PyBal on lvs1017 to allow for cable move T366361 [production]
11:55 <cmooney@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:40:00 on lvs1017.eqiad.wmnet with reason: moving lvs1017 link to row E from spine to leaf [production]
11:55 <cmooney@cumin1002> START - Cookbook sre.hosts.downtime for 0:40:00 on lvs1017.eqiad.wmnet with reason: moving lvs1017 link to row E from spine to leaf [production]
11:28 <cgoubert@cumin1002> END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:wikikube-worker-codfw [production]
11:27 <effie> kicking off k8s eqiad restarts - T366555 [production]
11:25 <jiji@cumin1002> START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:wikikube-worker-eqiad [production]
11:15 <hnowlan@deploy1002> helmfile [eqiad] DONE helmfile.d/services/data-gateway: apply [production]
11:09 <klausman@cumin1002> START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-serve-worker-eqiad [production]