2021-01-26
§
|
12:12 |
<Urbanecm> |
[urbanecm@mwmaint1002 ~]$ mwscript extensions/CirrusSearch/maintenance/UpdateSearchIndexConfig.php --wiki=trwikivoyage --cluster=all |
[production] |
12:07 |
<urbanecm@deploy1001> |
Synchronized wmf-config/InitialiseSettings.php: eab535fcc983d57dd36c41309162ace8aadcae1a: Add namespace aliases to Turkish Wikivoyage (T272782) (duration: 01m 00s) |
[production] |
11:47 |
<elukey@cumin1001> |
END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) |
[production] |
11:46 |
<akosiaris@deploy1001> |
helmfile [codfw] Ran 'sync' command on namespace 'linkrecommendation' for release 'external' . |
[production] |
11:44 |
<elukey@cumin1001> |
START - Cookbook sre.hosts.decommission |
[production] |
11:29 |
<moritzm> |
imported jenkins 2.263.3 to apt.wikimedia.org (thirdparty/ci) |
[production] |
09:53 |
<jmm@cumin2001> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host debmonitor1002.eqiad.wmnet |
[production] |
09:41 |
<jmm@cumin2001> |
START - Cookbook sre.hosts.reboot-single for host debmonitor1002.eqiad.wmnet |
[production] |
09:37 |
<elukey> |
reboot dbstore1005 for kernel upgrades |
[production] |
09:34 |
<urbanecm@deploy1001> |
Synchronized wmf-config/CommonSettings.php: Resync: Some mw2xxx hosts have old version (duration: 00m 55s) |
[production] |
09:32 |
<godog> |
disable mdadm check emails on ms-be1022 / known, and host is going to be decom'd - T267870 |
[production] |
09:29 |
<kormat@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on 6 hosts with reason: Restart mariadb to pick up config changes T272957 |
[production] |
09:29 |
<kormat@cumin1001> |
START - Cookbook sre.hosts.downtime for 0:30:00 on 6 hosts with reason: Restart mariadb to pick up config changes T272957 |
[production] |
09:28 |
<elukey> |
reboot dbstore1003 for kernel upgrades |
[production] |
09:24 |
<urbanecm@deploy1001> |
Synchronized wmf-config/logos.php: Resyncing to fix mw2xxx apache loading (duration: 00m 57s) |
[production] |
09:14 |
<elukey@cumin1001> |
END (FAIL) - Cookbook sre.hosts.decommission (exit_code=99) |
[production] |
09:14 |
<elukey> |
reboot dbstore1004 for kernel upgrades |
[production] |
09:13 |
<urbanecm@deploy1001> |
Synchronized wmf-config/InitialiseSettings.php: eab87780: frwiki: Fix tagline height and width (T272907) (duration: 00m 58s) |
[production] |
09:12 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repool db1078 (db1175 isn't ready yet)', diff saved to https://phabricator.wikimedia.org/P13959 and previous config saved to /var/cache/conftool/dbconfig/20210126-091236-marostegui.json |
[production] |
09:11 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Depool db1078 to clone db1175 T258361', diff saved to https://phabricator.wikimedia.org/P13958 and previous config saved to /var/cache/conftool/dbconfig/20210126-091149-marostegui.json |
[production] |
09:06 |
<elukey@cumin1001> |
START - Cookbook sre.hosts.decommission |
[production] |
08:53 |
<marostegui> |
Stop mysql on db1081 to clone db1160 |
[production] |
08:44 |
<jmm@cumin2001> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host flerovium.eqiad.wmnet |
[production] |
08:39 |
<jmm@cumin2001> |
START - Cookbook sre.hosts.reboot-single for host flerovium.eqiad.wmnet |
[production] |
08:38 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.hadoop.init-hadoop-workers (exit_code=0) for hosts an-worker[1119,1131].eqiad.wmnet |
[production] |
08:37 |
<jmm@cumin2001> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host furud.codfw.wmnet |
[production] |
08:36 |
<elukey@cumin1001> |
START - Cookbook sre.hadoop.init-hadoop-workers for hosts an-worker[1119,1131].eqiad.wmnet |
[production] |
08:33 |
<jmm@cumin2001> |
START - Cookbook sre.hosts.reboot-single for host furud.codfw.wmnet |
[production] |
08:30 |
<godog> |
swift start decom for ms-be20[17,19,21,23,24,25,26,27] - T272837 |
[production] |
08:28 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-worker1119.eqiad.wmnet with reason: REIMAGE |
[production] |
08:26 |
<elukey@cumin1001> |
END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on an-worker1131.eqiad.wmnet with reason: REIMAGE |
[production] |
08:26 |
<elukey@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on an-worker1131.eqiad.wmnet with reason: REIMAGE |
[production] |
08:26 |
<elukey@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on an-worker1119.eqiad.wmnet with reason: REIMAGE |
[production] |
08:19 |
<marostegui@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1160.eqiad.wmnet with reason: REIMAGE |
[production] |
08:18 |
<moritzm> |
upgrading OpenJDK on aqs and Hadoop systems |
[production] |
08:17 |
<marostegui@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on db1160.eqiad.wmnet with reason: REIMAGE |
[production] |
07:04 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Depool db1081 (s4 old master) - T271427', diff saved to https://phabricator.wikimedia.org/P13955 and previous config saved to /var/cache/conftool/dbconfig/20210126-070443-marostegui.json |
[production] |
07:01 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Promote db1138 to s4 master and remove read-only from s4 T271427', diff saved to https://phabricator.wikimedia.org/P13954 and previous config saved to /var/cache/conftool/dbconfig/20210126-070152-marostegui.json |
[production] |
07:00 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Set s4 as read-only for maintenance T271427', diff saved to https://phabricator.wikimedia.org/P13953 and previous config saved to /var/cache/conftool/dbconfig/20210126-070037-marostegui.json |
[production] |
07:00 |
<marostegui> |
Starting s4 eqiad failover from db1081 to db1138 - T271427 |
[production] |
06:55 |
<ryankemper> |
Restarted `wdqs-blazegraph` on `wdqs1005` - its blazegraph was deadlocked (based on the presence of null values for the blazegraph metrics for that host) |
[production] |
05:43 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Set candidate master to weight 0 before the failover T271427', diff saved to https://phabricator.wikimedia.org/P13952 and previous config saved to /var/cache/conftool/dbconfig/20210126-054337-marostegui.json |
[production] |
00:48 |
<dzahn@cumin1001> |
conftool action : set/pooled=yes; selector: name=mw2331.codfw.wmnet |
[production] |
00:47 |
<dzahn@cumin1001> |
conftool action : set/pooled=yes; selector: name=mw2318.codfw.wmnet |
[production] |
00:47 |
<dzahn@cumin1001> |
conftool action : set/pooled=yes; selector: name=mw2319.codfw.wmnet |
[production] |
00:46 |
<dzahn@cumin1001> |
conftool action : set/pooled=yes; selector: name=mw2320.codfw.wmnet |
[production] |
00:44 |
<dzahn@cumin1001> |
conftool action : set/pooled=no; selector: name=mw2331.codfw.wmnet |
[production] |
00:43 |
<dzahn@cumin1001> |
conftool action : set/pooled=no; selector: name=mw2318.codfw.wmnet |
[production] |
00:43 |
<dzahn@cumin1001> |
conftool action : set/pooled=no; selector: name=mw2319.codfw.wmnet |
[production] |
00:42 |
<dzahn@cumin1001> |
conftool action : set/pooled=no; selector: name=mw2320.codfw.wmnet |
[production] |