2020-05-21
19:23 <andrewbogott> disabling puppet on cloudbackup2001 to prevent the backup job from starting during maintenance [admin]
19:16 <andrewbogott> systemctl disable block_sync-tools-project.service on cloudbackup2001.codfw.wmnet to avoid stepping on current upgrade [admin]
18:24 <twentyafterfour> restarting phabricator on phab1001 to deploy https://phabricator.wikimedia.org/rPHEX2687d08786a9dadcbaa96709de991f471f239830 [production]
17:24 <elukey> add druid100[7,8] to the druid public cluster (not serving load balancer traffic for the moment, only joining the cluster) - T252771 [analytics]
17:24 <bblack> anycast experiment done, all back to normal [production]
17:20 <bblack> anycast experimentation commencing in ulsfo (test route withdrawal)... [production]
17:04 <bstorm_> starting labstore1005 upgrades T224582 [production]
16:44 <elukey> roll restart druid historical nodes on druid100[4-6] (public cluster) to pick up new settings - T252771 [analytics]
16:42 <Reedy> Reloading Zuul to deploy https://gerrit.wikimedia.org/r/597825 [releng]
16:34 <Reedy> Reloading Zuul to deploy https://gerrit.wikimedia.org/r/597820 [releng]
16:19 <James_F> Zuul: [mediawiki/extensions/Bootstrap] Switch down to quibble-composer for now. [releng]
16:14 <andrew@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
16:12 <andrew@cumin1001> START - Cookbook sre.hosts.downtime [production]
16:04 <Urbanecm> Restart StewardBot [tools.stewardbots]
16:04 <sbassett@deploy1001> Synchronized private/PrivateSettings.php: Update mitigations for T250887 (duration: 01m 08s) [production]
16:01 <Urbanecm> Investigating StewardBot's outage [tools.stewardbots]
15:55 <Reedy> Reloading Zuul to deploy https://gerrit.wikimedia.org/r/597810 [releng]
15:48 <andrewbogott> rebuilding cloudnet1003.eqiad.wmnet with Debian Buster for T253124 [production]
15:48 <andrewbogott> re-imaging cloudnet1003 with Buster [admin]
15:23 <ZI_Jony> staff restarted CVNBot21 on #cvn-mediawiki [cvn]
15:22 <XioNoX> Add BGP between cr1/2-eqiad and authdns1001 - T253196 [production]
15:09 <dzahn@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
15:09 <dzahn@cumin1001> START - Cookbook sre.hosts.downtime [production]
15:08 <dzahn@cumin1001> END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) [production]
15:08 <dzahn@cumin1001> START - Cookbook sre.hosts.downtime [production]
15:07 <dzahn@cumin1001> END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) [production]
15:07 <dzahn@cumin1001> START - Cookbook sre.hosts.downtime [production]
14:59 <dzahn@cumin1001> conftool action : set/pooled=inactive; selector: name=mw217[0-2].codfw.wmnet [production]
14:59 <dzahn@cumin1001> conftool action : set/pooled=inactive; selector: name=mw216[0-9].codfw.wmnet [production]
14:58 <dzahn@cumin1001> conftool action : set/pooled=inactive; selector: name=mw215[8-9].codfw.wmnet [production]
14:53 <bstorm_> adding the hiera values to horizon for bootstrapping k8s T211096 [paws]
14:50 <bblack@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
14:47 <bblack@cumin1001> START - Cookbook sre.hosts.downtime [production]
14:44 <akosiaris@deploy1001> helmfile [CODFW] Ran 'sync' command on namespace 'mathoid' for release 'canary' . [production]
14:39 <arturo> point record `k8s.svc.paws.eqiad1.wikimedia.cloud` to `172.16.1.186` (which is paws-k8s-control-1, for the initial bootstrap) (T211096) [paws]
14:33 <akosiaris> upload helmfile 0.109.0 to apt.wikimedia.org/buster-wikimedia and stretch-wikimedia, component main [production]
14:02 <elukey> restart druid kafka supervisor for wmf_netflow after maintenance [analytics]
13:53 <elukey> restart druid-historical on an-druid100[1,2] to pick up new settings [analytics]
13:51 <ZI_Jony> restarted Cubbie on #cvn-commons-uploads [cvn]
13:51 <vgutierrez> depool cp4032 for some ats tests [production]
13:22 <mutante> cloudnet1004 - reboot to test PXE boot [production]
13:17 <elukey> kill wmf_netflow druid supervisor for maintenance [analytics]
13:13 <elukey> stop druid-daemons on druid100[1-3] (one at a time) to move the druid partition from /srv/druid to /srv (didn't think about it before) - T252771 [analytics]
12:48 <arturo> created record `k8s.svc.paws.eqiad1.wikimedia.cloud` pointing to `172.16.0.191` (which is paws-k8s-haproxy-1) (T211096) [paws]
12:44 <andrewbogott> reimaging cloudnet1004.eqiad.wmnet for T253124 [production]
12:34 <arturo> created and transferred DNS zone `svc.paws.eqiad1.wikimedia.cloud` (T211096) [paws]
12:29 <elukey> roll restart druid-public cluster (druid100[4-6], backend for the AQS API) to apply new settings + openjdk upgrade - T252771 [production]
12:13 <mutante> depooled mw2158 through mw2172 to make room again in C3 as planned (T247018) [production]
12:12 <marostegui> Repool labsdb1011 into the analytics role 🤞- T249188 [production]
12:12 <dzahn@cumin1001> conftool action : set/pooled=no; selector: name=mw217[0-2].codfw.wmnet [production]