2021-02-23
09:40 |
<kormat@cumin1001> |
START - Cookbook sre.hosts.downtime for 1:30:00 on 6 hosts with reason: Restart mariadb to pick up config changes T266913 |
[production] |
09:35 |
<moritzm> |
installing bind security updates on buster (client-side tools/libs) |
[production] |
09:10 |
<oblivian@deploy1001> |
helmfile [eqiad] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'production' . |
[production] |
09:10 |
<jayme@deploy1001> |
helmfile [eqiad] Ran 'sync' command on namespace 'wikifeeds' for release 'production' . |
[production] |
09:06 |
<jayme@deploy1001> |
helmfile [codfw] Ran 'sync' command on namespace 'wikifeeds' for release 'production' . |
[production] |
08:55 |
<jmm@cumin2001> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cumin1001.eqiad.wmnet |
[production] |
08:50 |
<jayme@deploy1001> |
helmfile [staging] Ran 'sync' command on namespace 'wikifeeds' for release 'staging' . |
[production] |
08:47 |
<jmm@cumin2001> |
START - Cookbook sre.hosts.reboot-single for host cumin1001.eqiad.wmnet |
[production] |
08:40 |
<Urbanecm> |
[urbanecm@mwmaint1002 ~/altwiki]$ mwscript namespaceDupes.php altwiki --fix |
[production] |
08:39 |
<urbanecm@deploy1001> |
Synchronized wmf-config/InitialiseSettings.php: 9f434e2966393f7911d04b5bf77e02eb11bb16ab: Add ВП as an alias for NS_PROJECT in altwiki (T271980) (duration: 00m 59s) |
[production] |
08:39 |
<Urbanecm> |
Run mwscript updateSpecialPages.php --wiki=altwiki |
[production] |
08:02 |
<oblivian@deploy1001> |
helmfile [eqiad] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'production' . |
[production] |
07:56 |
<oblivian@deploy1001> |
helmfile [codfw] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'production' . |
[production] |
07:56 |
<oblivian@deploy1001> |
helmfile [codfw] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'staging' . |
[production] |
07:13 |
<hashar> |
Restarting CI Jenkins for plugin upgrade # T271683 |
[production] |
05:13 |
<krinkle@deploy1001> |
Finished deploy [integration/docroot@44d5685]: I307e8f4f6979 (duration: 00m 06s) |
[production] |
05:13 |
<krinkle@deploy1001> |
Started deploy [integration/docroot@44d5685]: I307e8f4f6979 |
[production] |
00:46 |
<eileen> |
civicrm revision changed from c535ac603a to 5e042e6e57, config revision is ef64f705bb |
[production] |
2021-02-22
23:59 |
<mutante> |
logstash2031 - systemctl reset-failed |
[production] |
23:53 |
<mutante> |
stat1007 - same problem and alerts as stat1004 |
[production] |
23:52 |
<mutante> |
stat1004 - systemctl reset-failed to clear icinga alerts for systemd state caused by jupyterhub singleuser services |
[production] |
23:47 |
<dpifke@deploy1001> |
Finished deploy [performance/arc-lamp@1f3bce1]: Revert https://gerrit.wikimedia.org/r/c/performance/arc-lamp/+/664600 (duration: 00m 05s) |
[production] |
23:47 |
<dpifke@deploy1001> |
Started deploy [performance/arc-lamp@1f3bce1]: Revert https://gerrit.wikimedia.org/r/c/performance/arc-lamp/+/664600 |
[production] |
23:37 |
<dzahn@cumin1001> |
conftool action : set/pooled=yes; selector: name=mw1286.eqiad.wmnet |
[production] |
23:36 |
<dzahn@cumin1001> |
conftool action : set/pooled=no; selector: name=mw1286.eqiad.wmnet |
[production] |
23:34 |
<milimetric@deploy1001> |
Finished deploy [analytics/refinery@3de01b5] (thin): Fix camus (duration: 00m 07s) |
[production] |
23:34 |
<milimetric@deploy1001> |
Started deploy [analytics/refinery@3de01b5] (thin): Fix camus |
[production] |
23:33 |
<milimetric@deploy1001> |
Finished deploy [analytics/refinery@3de01b5]: Fix camus (duration: 14m 03s) |
[production] |
23:27 |
<robh@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on aqs1014.eqiad.wmnet with reason: REIMAGE |
[production] |
23:25 |
<robh@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on aqs1014.eqiad.wmnet with reason: REIMAGE |
[production] |
23:22 |
<oblivian@deploy1001> |
helmfile [eqiad] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'staging' . |
[production] |
23:22 |
<oblivian@deploy1001> |
helmfile [eqiad] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'production' . |
[production] |
23:19 |
<milimetric@deploy1001> |
Started deploy [analytics/refinery@3de01b5]: Fix camus |
[production] |
23:18 |
<oblivian@deploy1001> |
helmfile [eqiad] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'staging' . |
[production] |
23:18 |
<oblivian@deploy1001> |
helmfile [eqiad] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'production' . |
[production] |
23:09 |
<ppchelko@deploy1001> |
helmfile [eqiad] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'production' . |
[production] |
23:09 |
<ppchelko@deploy1001> |
helmfile [eqiad] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'staging' . |
[production] |
23:06 |
<dzahn@cumin1001> |
conftool action : set/pooled=yes; selector: name=mw1410.eqiad.wmnet |
[production] |
23:06 |
<dzahn@cumin1001> |
conftool action : set/pooled=yes; selector: name=mw1412.eqiad.wmnet |
[production] |
23:02 |
<dzahn@cumin1001> |
conftool action : set/pooled=no; selector: name=mw1412.eqiad.wmnet |
[production] |
23:00 |
<dzahn@cumin1001> |
conftool action : set/pooled=no; selector: name=mw1410.eqiad.wmnet |
[production] |
22:52 |
<dzahn@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1286.eqiad.wmnet with reason: REIMAGE |
[production] |
22:50 |
<legoktm> |
disabling puppet on mwdebug1001 to test https://gerrit.wikimedia.org/r/c/operations/puppet/+/664903 |
[production] |
22:49 |
<dzahn@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on mw1286.eqiad.wmnet with reason: REIMAGE |
[production] |
22:45 |
<dzahn@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1412.eqiad.wmnet with reason: REIMAGE |
[production] |
22:43 |
<dzahn@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1410.eqiad.wmnet with reason: REIMAGE |
[production] |
22:42 |
<krinkle@deploy1001> |
Synchronized w/fatal-error.php: df694d695 (duration: 00m 56s) |
[production] |
22:42 |
<dzahn@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on mw1412.eqiad.wmnet with reason: REIMAGE |
[production] |
22:41 |
<dzahn@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on mw1410.eqiad.wmnet with reason: REIMAGE |
[production] |
22:31 |
<ppchelko@deploy1001> |
helmfile [eqiad] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'staging' . |
[production] |