2023-01-03
14:48 <filippo@cumin1001> START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: graphite1004.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - filippo@cumin1001" [production]
14:45 <taavi@deploy1002> taavi and matmarex: Backport for [[gerrit:874870|Revert "Revert "Start mobile DiscussionTools A/B test"" (T321961)]] synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet [production]
14:44 <filippo@cumin1001> START - Cookbook sre.dns.netbox [production]
14:43 <taavi@deploy1002> Started scap: Backport for [[gerrit:874870|Revert "Revert "Start mobile DiscussionTools A/B test"" (T321961)]] [production]
14:41 <taavi@deploy1002> Finished scap: Backport for [[gerrit:874866|Log token for the DiscussionTools mobile a/b test (T321961)]], [[gerrit:874867|Log bucket/token for the DiscussionTools mobile a/b test (T321961)]], [[gerrit:874868|a/b test anonymous ID was being reset because of cookie prefixes (T321961)]], [[gerrit:874869|Log bucket/token for the DiscussionTools mobile a/b test (T321961)]] (duration: 08m 31s) [production]
14:39 <filippo@cumin1001> START - Cookbook sre.hosts.decommission for hosts graphite1004.eqiad.wmnet [production]
14:34 <taavi@deploy1002> taavi and matmarex: Backport for [[gerrit:874866|Log token for the DiscussionTools mobile a/b test (T321961)]], [[gerrit:874867|Log bucket/token for the DiscussionTools mobile a/b test (T321961)]], [[gerrit:874868|a/b test anonymous ID was being reset because of cookie prefixes (T321961)]], [[gerrit:874869|Log bucket/token for the DiscussionTools mobile a/b test (T321961)]] synced to the testservers: [production]
14:33 <taavi@deploy1002> Started scap: Backport for [[gerrit:874866|Log token for the DiscussionTools mobile a/b test (T321961)]], [[gerrit:874867|Log bucket/token for the DiscussionTools mobile a/b test (T321961)]], [[gerrit:874868|a/b test anonymous ID was being reset because of cookie prefixes (T321961)]], [[gerrit:874869|Log bucket/token for the DiscussionTools mobile a/b test (T321961)]] [production]
14:13 <oblivian@deploy1002> Finished scap: Backport for [[gerrit:841139|etcd: use the v3-style SRV record (T320397)]] (duration: 07m 58s) [production]
14:07 <oblivian@deploy1002> oblivian and oblivian: Backport for [[gerrit:841139|etcd: use the v3-style SRV record (T320397)]] synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet [production]
14:05 <oblivian@deploy1002> Started scap: Backport for [[gerrit:841139|etcd: use the v3-style SRV record (T320397)]] [production]
13:55 <cgoubert@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-cluster (exit_code=0) [production]
13:46 <moritzm> installing libksba security updates [production]
13:24 <jelto@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host phab1004.eqiad.wmnet [production]
13:19 <jelto@cumin1001> START - Cookbook sre.hosts.reboot-single for host phab1004.eqiad.wmnet [production]
12:33 <taavi@deploy1002> Finished deploy [horizon/deploy@9d02cd6]: pushing wmf-puppet-dashboard updates for enc git handling (duration: 02m 49s) [production]
12:30 <taavi@deploy1002> Started deploy [horizon/deploy@9d02cd6]: pushing wmf-puppet-dashboard updates for enc git handling [production]
12:28 <taavi@deploy1002> Finished deploy [horizon/deploy@9d02cd6] (dev): pushing wmf-puppet-dashboard updates for enc git handling (duration: 01m 12s) [production]
12:27 <taavi@deploy1002> Started deploy [horizon/deploy@9d02cd6] (dev): pushing wmf-puppet-dashboard updates for enc git handling [production]
11:40 <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db2131', diff saved to https://phabricator.wikimedia.org/P42744 and previous config saved to /var/cache/conftool/dbconfig/20230103-114030-marostegui.json [production]
11:35 <cgoubert@cumin1001> START - Cookbook sre.hosts.reboot-cluster [production]
11:34 <cgoubert@cumin1001> END (ERROR) - Cookbook sre.hosts.reboot-cluster (exit_code=97) [production]
11:34 <cgoubert@cumin1001> START - Cookbook sre.hosts.reboot-cluster [production]
11:33 <cgoubert@cumin1001> END (ERROR) - Cookbook sre.hosts.reboot-cluster (exit_code=97) [production]
11:30 <jelto@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host contint2001.wikimedia.org [production]
11:26 <cgoubert@cumin1001> START - Cookbook sre.hosts.reboot-cluster [production]
11:25 <claime> Starting rolling reboot of parse* hosts in codfw [production]
11:06 <hashar> contint2001: starting Jenkins manually [production]
11:04 <marostegui> Change x1 binlog format to STATEMENT T255174 [production]
11:00 <btullis@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on an-worker[1080,1084].eqiad.wmnet with reason: Shutting down to enable RAID battery replacement [production]
10:59 <btullis@cumin1001> START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on an-worker[1080,1084].eqiad.wmnet with reason: Shutting down to enable RAID battery replacement [production]
10:59 <jelto@cumin1001> START - Cookbook sre.hosts.reboot-single for host contint2001.wikimedia.org [production]
10:58 <jelto@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host contint2002.wikimedia.org [production]
10:53 <marostegui> Restart eqiad sanitarium T326105 [production]
10:53 <jelto@cumin1001> START - Cookbook sre.hosts.reboot-single for host contint2002.wikimedia.org [production]
10:50 <marostegui> Restart codfw sanitarium masters T326105 [production]
10:49 <jelto@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host contint1002.wikimedia.org [production]
10:43 <jelto@cumin1001> START - Cookbook sre.hosts.reboot-single for host contint1002.wikimedia.org [production]
10:37 <cgoubert@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on parse1002.eqiad.wmnet with reason: CPU1 machine check error [production]
10:36 <cgoubert@cumin1001> START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on parse1002.eqiad.wmnet with reason: CPU1 machine check error [production]
10:36 <jelto@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gerrit1001.wikimedia.org [production]
10:31 <jelto@cumin1001> START - Cookbook sre.hosts.reboot-single for host gerrit1001.wikimedia.org [production]
10:25 <jelto@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gerrit2002.wikimedia.org [production]
10:18 <jelto@cumin1001> START - Cookbook sre.hosts.reboot-single for host gerrit2002.wikimedia.org [production]
09:27 <vgutierrez> restarting varnish on cp5032 to clear VarnishChildRestarted alert - T325797 [production]
08:19 <kartik@deploy1002> Finished scap: Backport for [[gerrit:869347|Content Translation: Move ttwiki out of Beta (T319177)]] (duration: 16m 09s) [production]
08:16 <jmm@puppetmaster1001> conftool action : set/pooled=inactive; selector: name=parse1002.eqiad.wmnet [production]
08:12 <moritzm> installing Linux 4.19.269 on Buster hosts [production]
08:12 <phedenskog@deploy1002> Finished deploy [performance/navtiming@4f8c010]: (no justification provided) (duration: 00m 08s) [production]
08:12 <phedenskog@deploy1002> Started deploy [performance/navtiming@4f8c010]: (no justification provided) [production]