|
2020-02-27
|
| 00:53 |
<bstorm_> |
stopped the start job in this tool to see if it was the source of a large IO spike |
[tools.hat-collector] |
| 00:52 |
<pt1979@cumin2001> |
END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) |
[production] |
| 00:49 |
<pt1979@cumin2001> |
START - Cookbook sre.hosts.downtime |
[production] |
| 00:42 |
<pt1979@cumin2001> |
END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) |
[production] |
| 00:39 |
<pt1979@cumin2001> |
START - Cookbook sre.hosts.downtime |
[production] |
| 00:29 |
<bd808> |
Drained tools-worker-1009 for reboot (NFS flakey) |
[tools] |
| 00:21 |
<jforrester@deploy1001> |
Synchronized w/extract2.php: T239975: Use Article::getPage()->getTouched(), not Article::getTouched (duration: 01m 04s) |
[production] |
| 00:17 |
<jforrester@deploy1001> |
Synchronized wmf-config/InitialiseSettings.php: Bonus sync for cache clearance (duration: 01m 04s) |
[production] |
| 00:15 |
<jforrester@deploy1001> |
Synchronized wmf-config/InitialiseSettings.php: T232140: Merge definition of wgLogos and wgLogo (duration: 01m 04s) |
[production] |
| 00:13 |
<jforrester@deploy1001> |
Synchronized wmf-config/CommonSettings.php: T232140: Stop setting wgLogoHD from wgLogos (duration: 01m 05s) |
[production] |
| 00:11 |
<bd808> |
Uncordoned tools-worker-1009.tools.eqiad.wmflabs |
[tools] |
| 00:08 |
<bd808> |
Uncordoned tools-worker-1002.tools.eqiad.wmflabs |
[tools] |
| 00:02 |
<bd808> |
Rebooting tools-worker-1002 |
[tools] |
| 00:02 |
<jforrester@deploy1001> |
Synchronized wmf-config/InitialiseSettings.php: Bonus sync for cache clearance (duration: 01m 03s) |
[production] |
| 00:01 |
<jforrester@deploy1001> |
Synchronized wmf-config/InitialiseSettings.php: T246212 Stop setting wgULSLanguageDetection in IS, set in CS (duration: 01m 05s) |
[production] |
| 00:00 |
<bd808> |
Draining tools-worker-1002 to reboot for NFS problems |
[tools] |
|
2020-02-26
|
| 23:59 |
<jforrester@deploy1001> |
Synchronized wmf-config/CommonSettings.php: T246212 Set wgULSLanguageDetection false in CS (duration: 01m 04s) |
[production] |
| 23:55 |
<jforrester@deploy1001> |
Synchronized wmf-config/InitialiseSettings.php: Bonus sync for cache clearance (duration: 01m 04s) |
[production] |
| 23:54 |
<James_F> |
jforrester@deploy1001 Synchronized wmf-config/InitialiseSettings.php: T246193 Stop setting wgAllowTitlesInSVG, never read (and this was default anyway) (duration: 01m 05s) |
[production] |
| 23:42 |
<bd808> |
Drained tools-worker-1040 |
[tools] |
| 23:41 |
<bd808> |
Cordoned tools-worker-10[16-40] in preparation for shrinking legacy Kubernetes cluster |
[tools] |
| 23:19 |
<pt1979@cumin2001> |
END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) |
[production] |
| 23:16 |
<pt1979@cumin2001> |
END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) |
[production] |
| 23:16 |
<pt1979@cumin2001> |
START - Cookbook sre.hosts.downtime |
[production] |
| 23:15 |
<pt1979@cumin2001> |
START - Cookbook sre.hosts.downtime |
[production] |
| 23:12 |
<bstorm_> |
replacing all tool limit-ranges in the 2020 cluster with a lower cpu request version |
[tools] |
| 22:58 |
<dzahn@cumin1001> |
END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) |
[production] |
| 22:58 |
<dzahn@cumin1001> |
START - Cookbook sre.hosts.downtime |
[production] |
| 22:58 |
<dzahn@cumin1001> |
END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) |
[production] |
| 22:58 |
<dzahn@cumin1001> |
START - Cookbook sre.hosts.downtime |
[production] |
| 22:51 |
<pt1979@cumin2001> |
END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) |
[production] |
| 22:49 |
<pt1979@cumin2001> |
END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) |
[production] |
| 22:48 |
<pt1979@cumin2001> |
START - Cookbook sre.hosts.downtime |
[production] |
| 22:47 |
<pt1979@cumin2001> |
START - Cookbook sre.hosts.downtime |
[production] |
| 22:44 |
<foks> |
removing one file for legal compliance |
[production] |
| 22:29 |
<bstorm_> |
deleted pod maintain-kubeusers-6d9c45f4bc-5bqq5 to deploy new image |
[tools] |
| 22:27 |
<pt1979@cumin2001> |
END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) |
[production] |
| 22:25 |
<pt1979@cumin2001> |
START - Cookbook sre.hosts.downtime |
[production] |
| 22:19 |
<pt1979@cumin2001> |
END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) |
[production] |
| 22:16 |
<pt1979@cumin2001> |
START - Cookbook sre.hosts.downtime |
[production] |
| 22:03 |
<jeh> |
powering down cloudvirt1014 for hardware maintenance |
[admin] |
| 21:51 |
<Urbanecm> |
Password reset for User:Joax (T242941) |
[production] |
| 21:45 |
<James_F> |
Docker: Publishing mediawiki-phan-testrun:0.1.0 for T226117 |
[releng] |
| 21:28 |
<mutante> |
ganeti - shutting apt2001 down again |
[production] |
| 21:17 |
<ladsgroup@deploy1001> |
Synchronized wmf-config/InitialiseSettings.php: [[gerrit:574454|Decrease the reads for term store for clients down to Q2Mio (T219123)]], take II (duration: 01m 04s) |
[production] |
| 21:15 |
<ladsgroup@deploy1001> |
Synchronized wmf-config/InitialiseSettings.php: [[gerrit:574454|Decrease the reads for term store for clients down to Q2Mio (T219123)]] (duration: 01m 04s) |
[production] |
| 21:15 |
<mutante> |
ganeti - re-starting apt2001 which is mysteriously broken and "half up" ..as in you can't ssh to it and don't get console but it does cause icinga alerts |
[production] |
| 21:06 |
<bstorm_> |
deleting loads of stuck grid jobs |
[tools] |
| 20:35 |
<ladsgroup@deploy1001> |
Synchronized php-1.35.0-wmf.21/extensions/Wikibase/lib/includes/Store/Sql/Terms: SWAT: [[gerrit:575055|Do prefetching entity ids on batches of 20 entity per query (T246159)]] (duration: 01m 04s) |
[production] |
| 20:35 |
<bstorm_> |
hard rebooting the grid master for toolsbeta |
[toolsbeta] |