2021-01-26
07:01 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Promote db1138 to s4 master and remove read-only from s4 T271427', diff saved to https://phabricator.wikimedia.org/P13954 and previous config saved to /var/cache/conftool/dbconfig/20210126-070152-marostegui.json |
[production] |
07:00 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Set s4 as read-only for maintenance T271427', diff saved to https://phabricator.wikimedia.org/P13953 and previous config saved to /var/cache/conftool/dbconfig/20210126-070037-marostegui.json |
[production] |
07:00 |
<marostegui> |
Starting s4 eqiad failover from db1081 to db1138 - T271427 |
[production] |
06:55 |
<ryankemper> |
Restarted `wdqs-blazegraph` on `wdqs1005` - its blazegraph was deadlocked (based on the presence of null values for the blazegraph metrics for that host) |
[production] |
05:43 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Set candidate master to weight 0 before the failover T271427', diff saved to https://phabricator.wikimedia.org/P13952 and previous config saved to /var/cache/conftool/dbconfig/20210126-054337-marostegui.json |
[production] |
00:48 |
<dzahn@cumin1001> |
conftool action : set/pooled=yes; selector: name=mw2331.codfw.wmnet |
[production] |
00:47 |
<dzahn@cumin1001> |
conftool action : set/pooled=yes; selector: name=mw2318.codfw.wmnet |
[production] |
00:47 |
<dzahn@cumin1001> |
conftool action : set/pooled=yes; selector: name=mw2319.codfw.wmnet |
[production] |
00:46 |
<dzahn@cumin1001> |
conftool action : set/pooled=yes; selector: name=mw2320.codfw.wmnet |
[production] |
00:44 |
<dzahn@cumin1001> |
conftool action : set/pooled=no; selector: name=mw2331.codfw.wmnet |
[production] |
00:43 |
<dzahn@cumin1001> |
conftool action : set/pooled=no; selector: name=mw2318.codfw.wmnet |
[production] |
00:43 |
<dzahn@cumin1001> |
conftool action : set/pooled=no; selector: name=mw2319.codfw.wmnet |
[production] |
00:42 |
<dzahn@cumin1001> |
conftool action : set/pooled=no; selector: name=mw2320.codfw.wmnet |
[production] |
00:34 |
<legoktm@deploy1001> |
Synchronized wmf-config/CommonSettings.php: Invalidate configuration cache when logos.php is touched too (duration: 00m 56s) |
[production] |
00:32 |
<legoktm@deploy1001> |
Synchronized wmf-config/logos.php: Add script to mostly automate logo management (duration: 00m 55s) |
[production] |
00:16 |
<legoktm@deploy1001> |
Synchronized wmf-config/InitialiseSettings.php: Split $wmgSiteLogo{1,1_5,2}x to a separate logos.php (1/2) (duration: 01m 00s) |
[production] |
00:14 |
<legoktm@deploy1001> |
Synchronized wmf-config/logos.php: Split $wmgSiteLogo{1,1_5,2}x to a separate logos.php (1/2) (duration: 00m 56s) |
[production] |
00:08 |
<legoktm@deploy1001> |
Synchronized wmf-config/InitialiseSettings.php: T272920: arbcom_enwiki: Change favicon to a renamed copy of arbcom_ruwiki.ico (2/2) (duration: 00m 58s) |
[production] |
00:07 |
<legoktm@deploy1001> |
Synchronized static/favicon/arbcom_enwiki.ico: T272920: arbcom_enwiki: Change favicon to a renamed copy of arbcom_ruwiki.ico (1/2) (duration: 01m 00s) |
[production] |
2021-01-25
23:09 |
<dzahn@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2318.codfw.wmnet with reason: REIMAGE |
[production] |
23:07 |
<dzahn@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2319.codfw.wmnet with reason: REIMAGE |
[production] |
23:06 |
<dzahn@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on mw2318.codfw.wmnet with reason: REIMAGE |
[production] |
23:05 |
<dzahn@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on mw2319.codfw.wmnet with reason: REIMAGE |
[production] |
23:03 |
<dzahn@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2331.codfw.wmnet with reason: REIMAGE |
[production] |
23:01 |
<dzahn@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2320.codfw.wmnet with reason: REIMAGE |
[production] |
23:00 |
<dzahn@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on mw2331.codfw.wmnet with reason: REIMAGE |
[production] |
22:59 |
<dzahn@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on mw2320.codfw.wmnet with reason: REIMAGE |
[production] |
22:44 |
<dzahn@cumin1001> |
conftool action : set/pooled=yes; selector: name=mw1338.eqiad.wmnet |
[production] |
22:34 |
<dzahn@cumin1001> |
conftool action : set/pooled=yes; selector: name=mw2322.codfw.wmnet |
[production] |
22:34 |
<dzahn@cumin1001> |
conftool action : set/pooled=yes; selector: name=mw2323.codfw.wmnet |
[production] |
22:30 |
<dzahn@cumin1001> |
conftool action : set/pooled=no; selector: name=mw2322.codfw.wmnet |
[production] |
22:29 |
<dzahn@cumin1001> |
conftool action : set/pooled=no; selector: name=mw2323.codfw.wmnet |
[production] |
22:29 |
<dzahn@cumin1001> |
conftool action : set/pooled=no; selector: name=mw1338.eqiad.wmnet |
[production] |
21:45 |
<cstone> |
civicrm revision changed from 3afb54f6f9 to dfb2ea2148 |
[production] |
21:11 |
<pt1979@cumin2001> |
END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on cloudgw2002-dev.codfw.wmnet with reason: REIMAGE |
[production] |
21:09 |
<dzahn@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1338.eqiad.wmnet with reason: REIMAGE |
[production] |
21:08 |
<pt1979@cumin2001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on cloudgw2002-dev.codfw.wmnet with reason: REIMAGE |
[production] |
21:07 |
<dzahn@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on mw1338.eqiad.wmnet with reason: REIMAGE |
[production] |
20:49 |
<dzahn@cumin1001> |
conftool action : set/pooled=yes; selector: name=mw2326.codfw.wmnet |
[production] |
20:46 |
<dzahn@cumin1001> |
conftool action : set/pooled=no; selector: name=mw2326.codfw.wmnet |
[production] |
20:44 |
<dzahn@cumin1001> |
conftool action : set/pooled=yes; selector: name=mw1410.eqiad.wmnet |
[production] |
20:40 |
<dzahn@cumin1001> |
conftool action : set/pooled=no; selector: name=mw1410.eqiad.wmnet |
[production] |
20:35 |
<otto@deploy1001> |
helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-main' for release 'canary' . |
[production] |
20:35 |
<otto@deploy1001> |
helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-main' for release 'production' . |
[production] |
20:25 |
<dzahn@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2323.codfw.wmnet with reason: REIMAGE |
[production] |
20:23 |
<dzahn@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2322.codfw.wmnet with reason: REIMAGE |
[production] |
20:23 |
<dzahn@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on mw2323.codfw.wmnet with reason: REIMAGE |
[production] |
20:21 |
<dzahn@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on mw2322.codfw.wmnet with reason: REIMAGE |
[production] |
20:19 |
<dzahn@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2326.codfw.wmnet with reason: REIMAGE |
[production] |
20:17 |
<dzahn@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on mw2326.codfw.wmnet with reason: REIMAGE |
[production] |