2021-07-27 §
16:43 <elukey@cumin1001> START - Cookbook sre.hosts.reboot-single for host ml-serve1004.eqiad.wmnet [production]
16:42 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve1002.eqiad.wmnet [production]
16:37 <elukey@cumin1001> START - Cookbook sre.hosts.reboot-single for host ml-serve1002.eqiad.wmnet [production]
16:34 <herron@cumin1001> START - Cookbook sre.hosts.decommission for hosts logstash2020.codfw.wmnet [production]
16:26 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve1001.eqiad.wmnet [production]
16:21 <elukey@cumin1001> START - Cookbook sre.hosts.reboot-single for host ml-serve1001.eqiad.wmnet [production]
16:14 <dcausse@deploy1002> helmfile [staging] Ran 'sync' command on namespace 'rdf-streaming-updater' for release 'main' . [production]
15:42 <elukey> add disk_template drbd back to ml-serve-ctrl100[12] vms after performance testing - T287238 [production]
15:22 <dcausse> cirrus: reindexing 823 wikis in elastic@[eqiad, codfw and cloudelastic] to apply new mapping (weighted_tags) T147505 [production]
15:22 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve2004.codfw.wmnet [production]
15:17 <mmandere> pool lvs1014.eqiad.wmnet - T286061 [production]
15:16 <elukey@cumin1001> START - Cookbook sre.hosts.reboot-single for host ml-serve2004.codfw.wmnet [production]
15:16 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve2003.codfw.wmnet [production]
15:11 <marostegui> Move m1-master from dbproxy1012 to dbproxy1014 T286061 [production]
15:11 <mmandere> pool authdns1001.wikimedia.org - T286061 [production]
15:10 <elukey@cumin1001> START - Cookbook sre.hosts.reboot-single for host ml-serve2003.codfw.wmnet [production]
15:09 <mmandere> pool cp10[79-82].eqiad.wmnet - T286061 [production]
15:05 <jmm@puppetmaster1001> conftool action : set/pooled=yes; selector: name=ldap-replica1003.wikimedia.org [production]
14:57 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve2002.codfw.wmnet [production]
14:55 <mmandere@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on lvs1014.eqiad.wmnet with reason: Eqiad row B maintenance [production]
14:55 <mmandere@cumin1001> START - Cookbook sre.hosts.downtime for 1:00:00 on lvs1014.eqiad.wmnet with reason: Eqiad row B maintenance [production]
14:53 <marostegui@cumin1001> dbctl commit (dc=all): 'Repool db1162 T287230', diff saved to https://phabricator.wikimedia.org/P16917 and previous config saved to /var/cache/conftool/dbconfig/20210727-145352-marostegui.json [production]
14:53 <moritzm> disabling puppet for upcoming row B maintenance [production]
14:52 <mmandere> depool lvs1014 - T286061 [production]
14:52 <elukey@cumin1001> START - Cookbook sre.hosts.reboot-single for host ml-serve2002.codfw.wmnet [production]
14:52 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve2001.codfw.wmnet [production]
14:51 <mmandere@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on authdns1001.wikimedia.org with reason: Eqiad row B maintenance [production]
14:51 <mmandere@cumin1001> START - Cookbook sre.hosts.downtime for 1:00:00 on authdns1001.wikimedia.org with reason: Eqiad row B maintenance [production]
14:48 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sretest1002.eqiad.wmnet [production]
14:47 <elukey@cumin1001> START - Cookbook sre.hosts.reboot-single for host ml-serve2001.codfw.wmnet [production]
14:47 <mmandere@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on cp[1079-1082].eqiad.wmnet with reason: Eqiad row B maintenance [production]
14:46 <mmandere@cumin1001> START - Cookbook sre.hosts.downtime for 1:00:00 on cp[1079-1082].eqiad.wmnet with reason: Eqiad row B maintenance [production]
14:45 <jmm@cumin2002> START - Cookbook sre.hosts.reboot-single for host sretest1002.eqiad.wmnet [production]
14:43 <marostegui@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1129.eqiad.wmnet with reason: REIMAGE [production]
14:41 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sretest1002.eqiad.wmnet [production]
14:40 <marostegui@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on db1129.eqiad.wmnet with reason: REIMAGE [production]
14:40 <mmandere> depool authdns1001 - T286061 [production]
14:40 <elukey@cumin1001> END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host ml-serve-ctrl2002.codfw.wmnet [production]
14:36 <jmm@cumin2002> START - Cookbook sre.hosts.reboot-single for host sretest1002.eqiad.wmnet [production]
14:34 <elukey@cumin1001> START - Cookbook sre.hosts.reboot-single for host ml-serve-ctrl2002.codfw.wmnet [production]
14:33 <mmandere> depool cp10[79-82].eqiad.wmnet - T286061 [production]
14:33 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve-ctrl2001.codfw.wmnet [production]
14:30 <topranks> Add peering to AS398196 - Cobalt Ridge at DE-CIX Dallas on cr2-codfw. [production]
14:29 <elukey> reduce vcores for ml-serve-ctrl[12]00[12] after performance testing - T287238 [production]
14:28 <elukey@cumin1001> START - Cookbook sre.hosts.reboot-single for host ml-serve-ctrl2001.codfw.wmnet [production]
14:25 <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db1129 T287230', diff saved to https://phabricator.wikimedia.org/P16916 and previous config saved to /var/cache/conftool/dbconfig/20210727-142520-marostegui.json [production]
14:19 <otto@deploy1002> helmfile [codfw] Ran 'sync' command on namespace 'eventgate-main' for release 'production' . [production]
14:16 <otto@deploy1002> helmfile [codfw] Ran 'sync' command on namespace 'eventgate-main' for release 'canary' . [production]
14:13 <otto@deploy1002> helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-main' for release 'production' . [production]
14:13 <otto@deploy1002> helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-main' for release 'canary' . [production]