2021-07-28
|
14:33 |
<dzahn@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on mw1434.eqiad.wmnet with reason: known issue |
[production] |
14:33 |
<dzahn@cumin1001> |
START - Cookbook sre.hosts.downtime for 4:00:00 on mw1434.eqiad.wmnet with reason: known issue |
[production] |
14:19 |
<elukey@deploy1002> |
helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'. |
[production] |
14:06 |
<dzahn@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1436.eqiad.wmnet with reason: REIMAGE |
[production] |
14:06 |
<elukey@deploy1002> |
helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'. |
[production] |
14:06 |
<elukey@deploy1002> |
helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'. |
[production] |
14:06 |
<dcausse@deploy1002> |
helmfile [staging] Ran 'sync' command on namespace 'rdf-streaming-updater' for release 'main' . |
[production] |
14:04 |
<dzahn@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1435.eqiad.wmnet with reason: REIMAGE |
[production] |
14:03 |
<dzahn@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on mw1436.eqiad.wmnet with reason: REIMAGE |
[production] |
14:01 |
<dzahn@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on mw1435.eqiad.wmnet with reason: REIMAGE |
[production] |
13:32 |
<dzahn@cumin1001> |
conftool action : set/pooled=inactive; selector: name=mw143[4-6].eqiad.wmnet |
[production] |
13:29 |
<moritzm> |
installing python2.7 security updates on stretch |
[production] |
13:08 |
<moritzm> |
installing python3.5 security updates on stretch |
[production] |
12:27 |
<dcausse@deploy1002> |
helmfile [staging] Ran 'sync' command on namespace 'rdf-streaming-updater' for release 'main' . |
[production] |
11:26 |
<moritzm> |
installing nginx security updates on thumbor* |
[production] |
11:18 |
<moritzm> |
installing nginx security updates on sodium (mirrors.wikimedia.org) |
[production] |
11:03 |
<dzahn@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 8:00:00 on planet1002.eqiad.wmnet with reason: known issue |
[production] |
11:03 |
<dzahn@cumin1001> |
START - Cookbook sre.hosts.downtime for 5 days, 8:00:00 on planet1002.eqiad.wmnet with reason: known issue |
[production] |
10:11 |
<moritzm> |
installing remaining nginx security updates on stretch |
[production] |
10:09 |
<godog> |
temp fix prometheus-icinga-am on alert1001 |
[production] |
09:40 |
<dcausse@deploy1002> |
helmfile [staging] Ran 'sync' command on namespace 'rdf-streaming-updater' for release 'main' . |
[production] |
09:40 |
<urbanecm> |
Start server-side upload for 1 video file (T287482) |
[production] |
09:29 |
<elukey@deploy1002> |
helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'. |
[production] |
09:29 |
<elukey@deploy1002> |
helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'. |
[production] |
09:28 |
<elukey@deploy1002> |
helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'. |
[production] |
09:24 |
<elukey@deploy1002> |
helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'. |
[production] |
09:24 |
<elukey@deploy1002> |
helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'. |
[production] |
08:33 |
<marostegui@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1122.eqiad.wmnet with reason: REIMAGE |
[production] |
08:31 |
<marostegui@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on db1122.eqiad.wmnet with reason: REIMAGE |
[production] |
08:27 |
<Amir1> |
running several long-running queries against pc1007 |
[production] |
08:13 |
<oblivian@deploy1002> |
helmfile [staging] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . |
[production] |
08:01 |
<dcausse@deploy1002> |
helmfile [staging] Ran 'sync' command on namespace 'rdf-streaming-updater' for release 'main' . |
[production] |
07:53 |
<moritzm> |
installing aspell security updates on stretch |
[production] |
07:20 |
<dcaro@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on 29 hosts with reason: T287559 |
[production] |
07:20 |
<dcaro@cumin1001> |
START - Cookbook sre.hosts.downtime for 5:00:00 on 29 hosts with reason: T287559 |
[production] |
07:20 |
<dcaro@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on 40 hosts with reason: T287559 |
[production] |
07:20 |
<dcaro@cumin1001> |
START - Cookbook sre.hosts.downtime for 5:00:00 on 40 hosts with reason: T287559 |
[production] |
07:20 |
<dcaro@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on 6 hosts with reason: T287559 |
[production] |
07:20 |
<dcaro@cumin1001> |
START - Cookbook sre.hosts.downtime for 5:00:00 on 6 hosts with reason: T287559 |
[production] |
07:07 |
<godog> |
remove cloud*/syslog.log from centrallog2001 - T287559 |
[production] |
07:06 |
<godog> |
remove node_pinger.prom from node-pinger hosts |
[production] |
06:42 |
<godog> |
remove obsolete user.log.manual-rotation from centrallog1001 to free disk space |
[production] |
02:43 |
<TimStarling> |
on mwmaint2002 fixing T286273 broken files using eval.php |
[production] |
2021-07-27
|
23:53 |
<thcipriani@deploy1002> |
Synchronized php-1.37.0-wmf.16/skins/Vector: Backport: [[gerrit:708220|Restore print, links, table and message box styles (T278896)]] (duration: 01m 07s) |
[production] |
23:15 |
<cjming@deploy1002> |
Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:708152|Enable user links on office + test wikis (T287391)]] (duration: 02m 00s) |
[production] |
20:44 |
<ryankemper> |
[WDQS] Returning `wdqs` dns discovery to the expected status of `(eqiad, codfw) = (depooled, pooled)`: `sudo confctl --object-type discovery select 'dnsdisc=wdqs,name=eqiad' set/pooled=false` |
[production] |
20:44 |
<legoktm> |
legoktm@wtp1025:~$ sudo systemctl restart php7.2-fpm # restart php-fpm, opcache hit ratio was warning |
[production] |
20:43 |
<ryankemper@puppetmaster1001> |
conftool action : set/pooled=false; selector: dnsdisc=wdqs,name=eqiad |
[production] |
20:35 |
<twentyafterfour@deploy1002> |
Pruned MediaWiki: 1.37.0-wmf.14 (duration: 03m 12s) |
[production] |
20:25 |
<twentyafterfour@deploy1002> |
rebuilt and synchronized wikiversions files: group0 wikis to 1.37.0-wmf.16 |
[production] |