2021-02-28
19:20 <wm-bot> <lucaswerkmeister> deployed bbca6e5b8e (better OAuth error handling) [tools.ranker]
19:17 <wm-bot> <lucaswerkmeister> deployed 03d707756b (fix return type, should be a no-op) [tools.quickcategories]
19:15 <wm-bot> <lucaswerkmeister> deployed a01dae7728 (better OAuth error handling) [tools.quickcategories]
19:04 <wm-bot> <lucaswerkmeister> deployed a543196e25 (better OAuth error handling) [tools.speedpatrolling]
18:55 <wm-bot> <lucaswerkmeister> deployed ae6a228597 (better OAuth error handling) [tools.wd-image-positions]
18:19 <legoktm> added Majavah as a maintainer [tools.wikibugs]
18:17 <legoktm> manually stopped all jobs and started them [tools.wikibugs]
17:18 <wm-bot> <lucaswerkmeister> deployed 369031b945 (minifix) [tools.lexeme-forms]
17:10 <wm-bot> <lucaswerkmeister> deployed 0455dc20f4 (better OAuth error handling) [tools.lexeme-forms]
14:17 <gehel> repooled wdqs1011 - caught up on lag [production]
04:54 <andrewbogott> restarted redis-server on tools-redis-1003 and tools-redis-1004 in an attempt to reduce replag, no real change detected [admin]
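Replication lag on a Redis replica can be checked before and after a restart like this; a minimal sketch, assuming shell access to the hosts named above (the port is the Redis default and the systemd unit name follows the Debian packaging):
    # shows role, master_link_status and the replication offsets on the replica
    redis-cli -h tools-redis-1004 -p 6379 info replication
    # restart the service (assumes the redis-server systemd unit, as on Debian)
    sudo systemctl restart redis-server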
2021-02-27
22:03 <Reedy> re-armed beta keyholder... I think... [releng]
21:19 <dwisehaupt> ran the following on frdb2002 to allow replication to continue after conversion to utf8mb4 charset: set global slave_type_conversions = ALL_NON_LOSSY; [production]
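ALL_NON_LOSSY lets the replica apply row events whose column types differ from its own in a widening direction (e.g. utf8 on the master vs. utf8mb4 locally). A minimal sketch of applying it on the replica, assuming a root shell on frdb2002; the STOP/START SLAVE wrapping and the status check are assumptions about how it was done, the SET GLOBAL statement itself is from the entry above:
    sudo mysql -e "STOP SLAVE;"
    sudo mysql -e "SET GLOBAL slave_type_conversions = 'ALL_NON_LOSSY';"
    sudo mysql -e "START SLAVE; SHOW SLAVE STATUS\G" | grep -E 'Slave_(IO|SQL)_Running|Seconds_Behind_Master'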
18:44 <gehel> depooled wdqs1011 to catch up on lag [production]
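The depool here and the repool the next day (14:17, 2021-02-28) are normally done through conftool; a hedged sketch, assuming confctl is available on a cluster management host and that only the pooled flag is toggled:
    # take the node out of the load balancer pools
    sudo confctl select 'name=wdqs1011.eqiad.wmnet' set/pooled=no
    # once replication lag has caught up, put it back
    sudo confctl select 'name=wdqs1011.eqiad.wmnet' set/pooled=yes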
18:37 <gehel> powercycling wdqs1011 [production]
02:23 <bstorm> deployed typo fix to maintain-kubeusers in an innocent effort to make the weekend better T275910 [tools]
02:00 <bstorm> running a script to repair the dumps mount in all podpresets T275371 [tools]
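Toolforge tool namespaces at the time carried a PodPreset (settings.k8s.io/v1alpha1) that injected the shared NFS mounts into tool pods; a hedged sketch of inspecting the affected objects, assuming kubectl admin access (the namespace and object name shown are hypothetical):
    # list podpresets across all tool namespaces
    kubectl get podpresets --all-namespaces
    # dump one of them to check its dumps volume/volumeMount definition
    kubectl -n tool-example get podpreset mount-toolforge-vols -o yaml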
00:33 <andrewbogott> sudo cumin --timeout 500 "A:all and not O{project:clouddb-services}" 'lsb_release -c | grep -i buster && uname -r | grep -v 4.19.0-14-amd64 && reboot' [admin]
00:28 <andrewbogott> sudo cumin --timeout 500 "A:all and not O{project:clouddb-services}" 'lsb_release -c | grep -i buster && uname -r | grep -v 4.19.0-14-amd64 && echo reboot' [admin]
00:09 <andrewbogott> sudo cumin "A:all and not O{project:clouddb-services}" 'lsb_release -c | grep -i stretch && uname -r | grep -v 4.19.0-0.bpo.14-amd64 && reboot' [admin]
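The command chain in these cumin runs only reboots hosts that are on the named Debian release and not already on the target kernel; the 00:28 run, which ends in echo reboot, is the dry-run form of the 00:33 run above it. A minimal sketch of the pattern, with a simplified host selector:
    # dry run: prints "reboot" on every Buster host still on an old kernel, changes nothing
    sudo cumin "A:all" 'lsb_release -c | grep -i buster && uname -r | grep -v 4.19.0-14-amd64 && echo reboot'
    # real run: same selector and same test, but actually reboots the matching hosts
    sudo cumin "A:all" 'lsb_release -c | grep -i buster && uname -r | grep -v 4.19.0-14-amd64 && reboot'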
00:08 <mutante> deploy1002 - rsyncing home dirs from deploy1001 [production]
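A minimal sketch of that kind of copy, run on deploy1002 and assuming root ssh access to deploy1001 (the exact paths and flags used are not in the log):
    # preserve ownership, permissions and symlinks; pull homes from the old deploy host
    sudo rsync -a --info=progress2 deploy1001.eqiad.wmnet:/home/ /home/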
2021-02-26
23:20 <bstorm> rebooting clouddb-wikilabels-02 for patches [clouddb-services]
22:55 <bstorm> rebooting clouddb-wikireplicas-proxy-1 and clouddb-wikireplicas-proxy-2 before (hopefully) many people are using them [clouddb-services]
22:04 <bstorm> cleaned up grid jobs 1230666,1908277,1908299,2441500,2441513 [tools]
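Stale Son of Grid Engine jobs like these are removed by job id; a minimal sketch, assuming a grid admin shell (qstat and qdel are the standard SGE tools, the exact invocation used is not in the log):
    # check what a job is doing before deleting it
    qstat -j 1230666
    # delete the stuck jobs by id
    qdel 1230666 1908277 1908299 2441500 2441513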
21:40 <Majavah> restarted stuck job stewardbot, sulwatcher seems to be doing fine [tools.stewardbots]
21:27 <bstorm> hard rebooting tools-sgeexec-0947 [tools]
21:21 <bstorm> hard rebooting tools-sgeexec-0952.tools.eqiad.wmflabs [tools]
20:46 <andrewbogott> rebooting all hosts [cloudinfra]
20:39 <andrewbogott> rebooting all hosts [toolsbeta]
20:29 <mutante> deploy2001 - /srv/mediawiki-staging sudo find . -name *.cdb delete - deleted 190 GB of old cdb files (T275826 T265963) [production]
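As logged, the find invocation is slightly garbled; the command presumably run looks like the sketch below (the -delete flag and the quoting of the glob are assumptions based on the stated result):
    cd /srv/mediawiki-staging
    # quote the pattern so the shell does not expand it in the current directory
    sudo find . -name '*.cdb' -delete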
20:01 <bd808> Deleted csr in strange state for tool-ores-inspect [tools]
19:47 <James_F> Zuul: [mediawiki/services/geoshapes] Add typescript service CI T274380 [releng]
18:31 <dwisehaupt> starting the utf8mb4 table alters on frdb2002 under a root screen session [production]
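These are the conversions that later required slave_type_conversions on frdb2002 (21:19 entry, 2021-02-27, above). A minimal sketch of the kind of statement involved, run inside a screen session so it survives a dropped connection; the database, table and collation names here are hypothetical:
    # keep the long-running alters attached to a reusable root session
    sudo screen -S utf8mb4-alters
    # inside the session, convert a table to utf8mb4
    sudo mysql example_db -e "ALTER TABLE example_table CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;"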
17:59 <pt1979@cumin2001> END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on mwmaint2002.codfw.wmnet with reason: REIMAGE [production]
17:57 <pt1979@cumin2001> START - Cookbook sre.hosts.downtime for 2:00:00 on mwmaint2002.codfw.wmnet with reason: REIMAGE [production]
16:03 <razzi> rebalance kafka partitions for webrequest_upload partition 4 [analytics]
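Moving a single partition between brokers is normally done with Kafka's partition reassignment tool; a hedged sketch, assuming shell access to a broker, a local ZooKeeper, and illustrative broker ids (the actual plan and connection details are not in the log):
    # plan.json assigns webrequest_upload partition 4 to the desired brokers (ids are made up):
    # {"version":1,"partitions":[{"topic":"webrequest_upload","partition":4,"replicas":[1001,1002,1003]}]}
    kafka-reassign-partitions.sh --zookeeper localhost:2181 --reassignment-json-file plan.json --execute
    # verify once the data has moved
    kafka-reassign-partitions.sh --zookeeper localhost:2181 --reassignment-json-file plan.json --verify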
15:35 <dcaro> removed toolsbeta-test-k8s-etcd-9 with depool from kubeadmin/etcd (T274497) [toolsbeta]
15:04 <dcaro@cumin1001> END (PASS) - Cookbook sre.hosts.upgrade-and-reboot (exit_code=0) [production]
14:59 <dcaro@cumin1001> START - Cookbook sre.hosts.upgrade-and-reboot [production]
14:58 <dcaro> [eqiad] rebooting cloudcephosd1015 (last osd \o/) for kernel upgrade (T275753) [admin]
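When rebooting Ceph OSD hosts one at a time, as in this and the following cloudcephosd entries, rebalancing is usually suppressed for the duration; a minimal sketch, assuming admin access to the Ceph cluster (whether the cookbook does exactly this is an assumption):
    # stop Ceph from marking the host's OSDs out and rebalancing during the reboot
    sudo ceph osd set noout
    # ... reboot the host and wait for its OSDs to rejoin ...
    sudo ceph osd unset noout
    # confirm the cluster is back to HEALTH_OK
    sudo ceph -s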
14:57 <dcaro@cumin1001> END (PASS) - Cookbook sre.hosts.upgrade-and-reboot (exit_code=0) [production]
14:51 <dcaro@cumin1001> START - Cookbook sre.hosts.upgrade-and-reboot [production]
14:51 <dcaro> [eqiad] rebooting cloudcephosd1014 for kernel upgrade (T275753) [admin]
14:49 <dcaro@cumin1001> END (PASS) - Cookbook sre.hosts.upgrade-and-reboot (exit_code=0) [production]
14:44 <dcaro@cumin1001> START - Cookbook sre.hosts.upgrade-and-reboot [production]
14:44 <dcaro> [eqiad] rebooting cloudcephosd1013 for kernel upgrade (T275753) [admin]
14:43 <dcaro@cumin1001> END (PASS) - Cookbook sre.hosts.upgrade-and-reboot (exit_code=0) [production]
14:38 <dcaro@cumin1001> START - Cookbook sre.hosts.upgrade-and-reboot [production]