2021-11-26
| 16:11 | <arnoldokoth> | drain kubestage1002 node in prep for decommissioning | [production] |
| 16:05 | <arnoldokoth> | drain kubestage1001 node in prep for decommissioning | [production] |
| 15:46 | <elukey> | move /var/tmp/core/* to /srv/coredumps on ores1008 to free root space | [production] |
| 14:30 | <jelto@deploy1002> | helmfile [eqiad] Ran 'sync' command on namespace 'miscweb' for release 'main'. | [production] |
| 14:25 | <jelto@deploy1002> | helmfile [codfw] Ran 'sync' command on namespace 'miscweb' for release 'main'. | [production] |
| 14:21 | <jelto@deploy1002> | helmfile [staging] Ran 'sync' command on namespace 'miscweb' for release 'main'. | [production] |
| 14:15 | <hashar> | deployment-prep: https://integration.wikimedia.org/ci/job/beta-update-databases-eqiad/ database-updating job is broken since 06:20 UTC due to a segmentation fault | T296539 | [releng] |
| 14:13 | <wm-bot> | <chicocvenancio> Kick bridgebot to see if it stops duplication in Telegram | [tools.bridgebot] |
| 13:51 | <Amir1> | running T286552 schema changes in the cloud | [mailman] |
| 13:48 | <jelto@deploy1002> | helmfile [staging] Ran 'sync' command on namespace 'miscweb' for release 'main'. | [production] |
| 13:46 | <jelto@deploy1002> | helmfile [staging] Ran 'sync' command on namespace 'miscweb' for release 'main'. | [production] |
| 13:25 | <akosiaris@deploy1002> | helmfile [staging-codfw] DONE helmfile.d/admin 'apply'. | [production] |
| 13:25 | <akosiaris@deploy1002> | helmfile [staging-codfw] START helmfile.d/admin 'apply'. | [production] |
| 12:21 | <vgutierrez> | restarting HAProxy on O:cache::upload_haproxy - T290005 | [production] |
| 11:41 | <akosiaris> | T296303 cleanup weird state of calico-codfw cluster | [production] |
| 11:41 | <akosiaris@deploy1002> | helmfile [staging-codfw] DONE helmfile.d/admin 'sync'. | [production] |
| 11:41 | <akosiaris@deploy1002> | helmfile [staging-codfw] START helmfile.d/admin 'sync'. | [production] |
| 11:39 | <akosiaris@deploy1002> | helmfile [staging-codfw] START helmfile.d/admin 'sync'. | [production] |
| 11:25 | <vgutierrez> | restarting HAProxy on O:cache::(text|upload)_haproxy - T290005 | [production] |
| 10:23 | <ladsgroup@cumin1001> | dbctl commit (dc=all): 'Repool after fixing users T296274', diff saved to https://phabricator.wikimedia.org/P17880 and previous config saved to /var/cache/conftool/dbconfig/20211126-102340-ladsgroup.json | [production] |
| 10:17 | <ladsgroup@cumin1001> | dbctl commit (dc=all): 'Depooling db1111 (T296274)', diff saved to https://phabricator.wikimedia.org/P17879 and previous config saved to /var/cache/conftool/dbconfig/20211126-101714-ladsgroup.json | [production] |
| 10:17 | <ladsgroup@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1111.eqiad.wmnet with reason: Maintenance T296274 | [production] |
| 10:17 | <ladsgroup@cumin1001> | START - Cookbook sre.hosts.downtime for 4:00:00 on db1111.eqiad.wmnet with reason: Maintenance T296274 | [production] |
| 10:14 | <ladsgroup@cumin1001> | dbctl commit (dc=all): 'Repool after fixing users T296274', diff saved to https://phabricator.wikimedia.org/P17878 and previous config saved to /var/cache/conftool/dbconfig/20211126-101423-ladsgroup.json | [production] |
| 10:05 | <ladsgroup@cumin1001> | dbctl commit (dc=all): 'Depooling db1177 (T296274)', diff saved to https://phabricator.wikimedia.org/P17877 and previous config saved to /var/cache/conftool/dbconfig/20211126-100547-ladsgroup.json | [production] |
| 10:05 | <ladsgroup@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1177.eqiad.wmnet with reason: Maintenance T296274 | [production] |
| 10:05 | <ladsgroup@cumin1001> | START - Cookbook sre.hosts.downtime for 4:00:00 on db1177.eqiad.wmnet with reason: Maintenance T296274 | [production] |
| 10:04 | <ladsgroup@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance T296143 | [production] |
| 10:04 | <ladsgroup@cumin1001> | START - Cookbook sre.hosts.downtime for 4:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance T296143 | [production] |
| 08:28 | <ladsgroup@cumin1001> | dbctl commit (dc=all): 'After maintenance db1160 (T296143)', diff saved to https://phabricator.wikimedia.org/P17876 and previous config saved to /var/cache/conftool/dbconfig/20211126-082834-ladsgroup.json | [production] |
| 08:13 | <ladsgroup@cumin1001> | dbctl commit (dc=all): 'After maintenance db1160 (T296143)', diff saved to https://phabricator.wikimedia.org/P17875 and previous config saved to /var/cache/conftool/dbconfig/20211126-081329-ladsgroup.json | [production] |
| 07:58 | <ladsgroup@cumin1001> | dbctl commit (dc=all): 'After maintenance db1160 (T296143)', diff saved to https://phabricator.wikimedia.org/P17874 and previous config saved to /var/cache/conftool/dbconfig/20211126-075824-ladsgroup.json | [production] |
| 07:43 | <ladsgroup@cumin1001> | dbctl commit (dc=all): 'After maintenance db1160 (T296143)', diff saved to https://phabricator.wikimedia.org/P17873 and previous config saved to /var/cache/conftool/dbconfig/20211126-074320-ladsgroup.json | [production] |
| 07:07 | <majavah> | hard reboot deployment-mwmaint02 | [releng] |
| 06:28 | <Amir1> | killing extensions/MachineVision/maintenance/fetchSuggestions.php in mwmaint | [production] |
| 06:19 | <Amir1> | killing lingering processes from mwmaint to db1160, which was depooled nine hours ago | [production] |
2021-11-25
| 21:37 | <chicocvenancio> | rollback singleuser to PR #96 T295257 | [paws] |
| 21:34 | <wm-bot> | <lucaswerkmeister> deployed baef3a16f6 (l10n updates) | [tools.lexeme-forms] |
| 21:15 | <chicocvenancio> | deploy PR #110 changing singleuser to bump openrefine version T295257 | [paws] |
| 20:43 | <ladsgroup@cumin1001> | dbctl commit (dc=all): 'Depooling db1160 (T296143)', diff saved to https://phabricator.wikimedia.org/P17872 and previous config saved to /var/cache/conftool/dbconfig/20211125-204357-ladsgroup.json | [production] |
| 20:43 | <ladsgroup@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1160.eqiad.wmnet with reason: Maintenance T296143 | [production] |
| 20:43 | <ladsgroup@cumin1001> | START - Cookbook sre.hosts.downtime for 4:00:00 on db1160.eqiad.wmnet with reason: Maintenance T296143 | [production] |
| 19:28 | <ladsgroup@cumin1001> | END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1150.eqiad.wmnet with reason: Maintenance T296143 | [production] |
| 19:28 | <ladsgroup@cumin1001> | START - Cookbook sre.hosts.downtime for 4:00:00 on db1150.eqiad.wmnet with reason: Maintenance T296143 | [production] |
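The repeated ladsgroup entries above (db1111, db1177, db1160, db1150) all follow one database-maintenance cycle: depool the replica with dbctl, set monitoring downtime with the sre.hosts.downtime cookbook, run the maintenance, then repool. A minimal sketch of that sequence as run from a cumin host follows; the command names, hosts, and task IDs come from the log itself, but the exact subcommands and flags are assumptions, not verified against the conftool or Spicerack documentation:

```shell
# Stage a depool of the replica, then commit it to all datacenters.
# The log shows that a commit saves a diff to Phabricator and the previous
# config to /var/cache/conftool/dbconfig/.
dbctl instance db1160 depool
dbctl config commit -m 'Depooling db1160 (T296143)'

# Set 4 hours of downtime so the depooled host does not page
# (flags are an assumption; the log only records the duration and reason).
sudo cookbook sre.hosts.downtime --hours 4 -r 'Maintenance T296143' db1160.eqiad.wmnet

# ... run the schema change on db1160 ...

# Repool and commit again once maintenance is done.
dbctl instance db1160 pool
dbctl config commit -m 'After maintenance db1160 (T296143)'
```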