2021-09-07
14:23 <cmjohnson@cumin1001> END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on cloudcephosd1024.eqiad.wmnet with reason: REIMAGE [production]
14:23 <cmjohnson@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcephosd1024.eqiad.wmnet with reason: REIMAGE [production]
14:22 <cmjohnson@cumin1001> END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on cloudcephosd1023.eqiad.wmnet with reason: REIMAGE [production]
14:22 <cmjohnson@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcephosd1023.eqiad.wmnet with reason: REIMAGE [production]
14:17 <marostegui> No more db maintenance on eqiad T288594 [production]
14:08 <mutante> alert1001 - temp disabled puppet, stopped icinga-wm [production]
14:07 <mutante> temp killed icinga-wm because of flooding [production]
14:01 <Emperor> removing pc2010 from orchestrator T289117 [production]
13:59 <Emperor> removing pc2010 from tendril and zarcillo T289117 [production]
13:57 <pt1979@cumin2002> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
13:57 <XioNoX> drain esams-eqiad for circuit maintenance - T288503 [production]
13:54 <pt1979@cumin2002> START - Cookbook sre.dns.netbox [production]
13:51 <jayme> uncordoned kubestage2001 [production]
13:50 <jiji@deploy1002> helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . [production]
13:49 <mutante> mw2264 - scap pulled and repooled after T290242 [production]
13:49 <dzahn@cumin1001> conftool action : set/pooled=yes; selector: name=mw2264.codfw.wmnet [production]
13:43 <jiji@deploy1002> helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . [production]
13:40 <mvernon@cumin1001> END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pc2010.codfw.wmnet [production]
13:25 <mvernon@cumin1001> START - Cookbook sre.hosts.decommission for hosts pc2010.codfw.wmnet [production]
13:21 <Emperor> removing pc2009 from orchestrator T289116 [production]
13:21 <Emperor> removing pc2009 from tendril and zarcillo T289116 [production]
13:02 <marostegui@cumin1001> dbctl commit (dc=all): 'fix s8 weights T288594', diff saved to https://phabricator.wikimedia.org/P17248 and previous config saved to /var/cache/conftool/dbconfig/20210907-130244-marostegui.json [production]
12:59 <mvernon@cumin1001> END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pc2009.codfw.wmnet [production]
12:51 <mvernon@deploy1002> Synchronized wmf-config/ProductionServices.php: Remove old decommissioned pc hosts T284825 (duration: 01m 02s) [production]
12:45 <mvernon@cumin1001> START - Cookbook sre.hosts.decommission for hosts pc2009.codfw.wmnet [production]
12:27 <marostegui@cumin1001> dbctl commit (dc=all): 'fix s1 weights T288594', diff saved to https://phabricator.wikimedia.org/P17247 and previous config saved to /var/cache/conftool/dbconfig/20210907-122747-marostegui.json [production]
12:27 <marostegui@cumin1001> dbctl commit (dc=all): 'fix s1 weights T288594', diff saved to https://phabricator.wikimedia.org/P17246 and previous config saved to /var/cache/conftool/dbconfig/20210907-122708-marostegui.json [production]
11:46 <btullis@cumin1001> END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 6 hosts [production]
11:46 <btullis@cumin1001> START - Cookbook sre.hosts.remove-downtime for 6 hosts [production]
11:36 <awight> EU backport complete [production]
11:33 <awight@deploy1002> Synchronized php-1.37.0-wmf.21/extensions/CodeMirror/extension.json: Backport: [[gerrit:719170|Change line numbers default to null (T290226)]] (duration: 00m 59s) [production]
11:28 <awight@deploy1002> Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:717192|Set template namespace for code mirror line numbering (T290226)]] (duration: 00m 59s) [production]
10:51 <Emperor> removing pc2008 from orchestrator T289115 [production]
10:49 <Emperor> removing pc2008 from tendril and zarcillo T289115 [production]
10:46 <mvernon@cumin1001> END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pc2008.codfw.wmnet [production]
10:35 <mvernon@cumin1001> START - Cookbook sre.hosts.decommission for hosts pc2008.codfw.wmnet [production]
10:29 <btullis@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on 6 hosts with reason: commissioning aqs_new hosts [production]
10:29 <btullis@cumin1001> START - Cookbook sre.hosts.downtime for 5 days, 0:00:00 on 6 hosts with reason: commissioning aqs_new hosts [production]
10:29 <btullis@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on aqs1010.eqiad.wmnet with reason: commissioning aqs_new hosts [production]
10:29 <btullis@cumin1001> START - Cookbook sre.hosts.downtime for 5 days, 0:00:00 on aqs1010.eqiad.wmnet with reason: commissioning aqs_new hosts [production]
10:27 <Emperor> removing pc1010 from orchestrator T289122 [production]
10:22 <Emperor> removing pc1010 from tendril and zarcillo T289122 [production]
10:15 <mvernon@cumin1001> END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pc1010.eqiad.wmnet [production]
10:02 <mvernon@cumin1001> START - Cookbook sre.hosts.decommission for hosts pc1010.eqiad.wmnet [production]
09:46 <Emperor> removing pc1009 from orchestrator T289120 [production]
09:26 <Emperor> removing pc1009 from tendril and zarcillo T289120 [production]
09:25 <mvernon@cumin1001> END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pc1009.eqiad.wmnet [production]
09:16 <mvernon@cumin1001> START - Cookbook sre.hosts.decommission for hosts pc1009.eqiad.wmnet [production]
08:57 <elukey@cumin1001> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
08:53 <elukey@cumin1001> START - Cookbook sre.dns.netbox [production]