2021-09-07
|
17:01 |
<jgiannelos@deploy1002> |
helmfile [staging] Ran 'sync' command on namespace 'push-notifications' for release 'main' . |
[production] |
16:39 |
<moritzm> |
installing jetty9 security updates on buster |
[production] |
16:30 |
<dzahn@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 8:00:00 on planet1002.eqiad.wmnet with reason: known issue |
[production] |
16:30 |
<dzahn@cumin1001> |
START - Cookbook sre.hosts.downtime for 5 days, 8:00:00 on planet1002.eqiad.wmnet with reason: known issue |
[production] |
16:30 |
<dancy@deploy1002> |
Synchronized README: testing (duration: 00m 59s) |
[production] |
16:25 |
<James_F> |
Docker: Publishing node12-test-browser-php{72,80}-composer images |
[releng] |
16:19 |
<James_F> |
Zuul: [mediawiki/extensions/BlueSpiceDistributionConnector] Add 4 dependencies |
[releng] |
15:27 |
<majavah> |
rolling out python3-prometheus-client updates |
[tools] |
15:18 |
<akosiaris> |
run_benchmarky.py against mwdebug.svc.codfw.wmnet for performance tests |
[production] |
15:07 |
<akosiaris@deploy1002> |
helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . |
[production] |
15:04 |
<jbond> |
upload python-prometheus-client_0.6.0 to stretch-wikimedia |
[production] |
14:50 |
<mutante> |
snapshot1015 - manually removed prometheus-puppet-agent-stats from crontab which was sending spam and is now a timer |
[production] |
14:41 |
<majavah> |
manually removing some absented but still present crontabs to stop root@ spam |
[tools] |
14:33 |
<mutante> |
CI - migrating zuul-merger cronjob to systemd timer (contint*) |
[production] |
14:23 |
<XioNoX> |
re-pool esams-eqiad - T288503 |
[production] |
14:23 |
<cmjohnson@cumin1001> |
END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on cloudcephosd1024.eqiad.wmnet with reason: REIMAGE |
[production] |
14:23 |
<cmjohnson@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcephosd1024.eqiad.wmnet with reason: REIMAGE |
[production] |
14:22 |
<cmjohnson@cumin1001> |
END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on cloudcephosd1023.eqiad.wmnet with reason: REIMAGE |
[production] |
14:22 |
<cmjohnson@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcephosd1023.eqiad.wmnet with reason: REIMAGE |
[production] |
14:17 |
<marostegui> |
No more db maintenance on eqiad T288594 |
[production] |
14:08 |
<mutante> |
alert1001 - temp disabled puppet, stopped icinga-wm |
[production] |
14:07 |
<mutante> |
temp killed icinga-wm because of flooding |
[production] |
14:01 |
<Emperor> |
removing pc2010 from orchestrator T289117 |
[production] |
13:59 |
<Emperor> |
removing pc2010 from tendril and zarcillo T289117 |
[production] |
13:57 |
<pt1979@cumin2002> |
END (PASS) - Cookbook sre.dns.netbox (exit_code=0) |
[production] |
13:57 |
<XioNoX> |
drain esams-eqiad for circuit maintenance - T288503 |
[production] |
13:54 |
<pt1979@cumin2002> |
START - Cookbook sre.dns.netbox |
[production] |
13:51 |
<jayme> |
uncordoned kubestage2001 |
[production] |
13:50 |
<jiji@deploy1002> |
helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . |
[production] |
13:49 |
<mutante> |
mw2264 - scap pulled and repooled after T290242 |
[production] |
13:49 |
<dzahn@cumin1001> |
conftool action : set/pooled=yes; selector: name=mw2264.codfw.wmnet |
[production] |
13:43 |
<jiji@deploy1002> |
helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . |
[production] |
13:40 |
<mvernon@cumin1001> |
END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pc2010.codfw.wmnet |
[production] |
13:25 |
<mvernon@cumin1001> |
START - Cookbook sre.hosts.decommission for hosts pc2010.codfw.wmnet |
[production] |
13:21 |
<Emperor> |
removing pc2009 from orchestrator T289116 |
[production] |
13:21 |
<Emperor> |
removing pc2009 from tendril and zarcillo T289116 |
[production] |
13:10 |
<mdipietro> |
tab will close autocomplete window T289872 |
[quarry] |
13:02 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'fix s8 weights T288594', diff saved to https://phabricator.wikimedia.org/P17248 and previous config saved to /var/cache/conftool/dbconfig/20210907-130244-marostegui.json |
[production] |
12:59 |
<mvernon@cumin1001> |
END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pc2009.codfw.wmnet |
[production] |
12:51 |
<mvernon@deploy1002> |
Synchronized wmf-config/ProductionServices.php: Remove old decommissioned pc hosts T284825 (duration: 01m 02s) |
[production] |
12:45 |
<mvernon@cumin1001> |
START - Cookbook sre.hosts.decommission for hosts pc2009.codfw.wmnet |
[production] |
12:27 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'fix s1 weights T288594', diff saved to https://phabricator.wikimedia.org/P17247 and previous config saved to /var/cache/conftool/dbconfig/20210907-122747-marostegui.json |
[production] |
12:27 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'fix s1 weights T288594', diff saved to https://phabricator.wikimedia.org/P17246 and previous config saved to /var/cache/conftool/dbconfig/20210907-122708-marostegui.json |
[production] |
11:46 |
<btullis@cumin1001> |
END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 6 hosts |
[production] |
11:46 |
<btullis@cumin1001> |
START - Cookbook sre.hosts.remove-downtime for 6 hosts |
[production] |
11:41 |
<joal> |
Restarting cassandra hourly loading job after C2 snapshot taken and C3 tables truncated |
[analytics] |
11:37 |
<joal> |
Re-Add test rows in cassandra3 cluster after tables got truncated |
[analytics] |
11:36 |
<awight> |
EU backport complete |
[production] |
11:33 |
<awight@deploy1002> |
Synchronized php-1.37.0-wmf.21/extensions/CodeMirror/extension.json: Backport: [[gerrit:719170|Change line numbers default to null (T290226)]] (duration: 00m 59s) |
[production] |
11:28 |
<awight@deploy1002> |
Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:717192|Set template namespace for code mirror line numbering (T290226)]] (duration: 00m 59s) |
[production] |