2021-12-11
19:03 <andrew@cumin1001> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudvirt1028.eqiad.wmnet with OS buster [production]
00:04 <mwdebug-deploy@deploy1002> helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . [production]
00:00 <mwdebug-deploy@deploy1002> helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . [production]
2021-12-10
22:39 <mwdebug-deploy@deploy1002> helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . [production]
22:33 <mwdebug-deploy@deploy1002> helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . [production]
22:12 <mwdebug-deploy@deploy1002> helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . [production]
22:11 <mwdebug-deploy@deploy1002> helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . [production]
22:09 <dancy@deploy1002> rebuilt and synchronized wikiversions files: all wikis to 1.38.0-wmf.9 refs T293953 [production]
21:10 <rzl> sudo cumin -b7 -s10 -p0 'A:mw-eqiad and not P{mw1414.eqiad.wmnet}' restart-php7.2-fpm [production]
21:09 <rzl> rzl@mw1414:~$ sudo depool - preserving for investigation, T297517 [production]
20:43 <rzl> sudo cumin -b2 -s10 -p0 'A:parsoid and not P{wtp1025.eqiad.wmnet}' restart-php7.2-fpm - T297517 [production]
20:38 <rzl> rzl@wtp1025:~$ sudo restart-php7.2-fpm - T297517 - rolling restart to follow [production]
18:50 <jhathaway@cumin1001> END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts copernicium.wikimedia.org [production]
18:11 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on ganeti2017.codfw.wmnet with reason: Temporarily remove node from Ganeti for reimage [production]
18:11 <jmm@cumin2002> START - Cookbook sre.hosts.downtime for 4 days, 0:00:00 on ganeti2017.codfw.wmnet with reason: Temporarily remove node from Ganeti for reimage [production]
18:04 <jhathaway@cumin1001> START - Cookbook sre.hosts.decommission for hosts copernicium.wikimedia.org [production]
17:21 <dancy@deploy1002> Synchronized php-1.38.0-wmf.12/extensions/Cite/modules/ve-cite/ve.ui.MWReferencesListDialog.js: Backport: [[gerrit:745872|ve.ui.MWReferencesListDialog: Fix exception caused by a copy-paste mistake (T297418)]] (duration: 00m 58s) [production]
17:17 <mwdebug-deploy@deploy1002> helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . [production]
17:16 <mwdebug-deploy@deploy1002> helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . [production]
16:59 <mwdebug-deploy@deploy1002> helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . [production]
16:58 <mwdebug-deploy@deploy1002> helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . [production]
16:56 <dancy@deploy1002> Synchronized php-1.38.0-wmf.12/extensions/DiscussionTools/includes/Notifications/EventDispatcher.php: Backport: [[gerrit:745652|Fix PageRecord lookup (T297431)]] (duration: 00m 58s) [production]
16:56 <jynus> increase backup2007's allocated disk space [production]
16:43 <dancy@deploy1002> Synchronized php-1.38.0-wmf.12/extensions/DiscussionTools/includes/Notifications/EventDispatcher.php: Backport: [[gerrit:745652|Fix PageRecord lookup (T297431)]] (duration: 00m 58s) [production]
16:06 <mwdebug-deploy@deploy1002> helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . [production]
16:05 <mwdebug-deploy@deploy1002> helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . [production]
15:54 <jynus> increase backup2006's allocated disk space [production]
15:24 <marostegui@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1174 (T277354)', diff saved to https://phabricator.wikimedia.org/P18112 and previous config saved to /var/cache/conftool/dbconfig/20211210-152410-marostegui.json [production]
15:19 <jelto@deploy1002> helmfile [eqiad] Ran 'sync' command on namespace 'blubberoid' for release 'production' . [production]
15:16 <jelto@deploy1002> helmfile [eqiad] DONE helmfile.d/admin 'apply'. [production]
15:15 <jelto@deploy1002> helmfile [eqiad] START helmfile.d/admin 'apply'. [production]
15:15 <jelto> remove tiller from eqiad Kubernetes cluster [production]
15:09 <marostegui@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P18111 and previous config saved to /var/cache/conftool/dbconfig/20211210-150906-marostegui.json [production]
15:08 <jelto@deploy1002> helmfile [codfw] Ran 'sync' command on namespace 'blubberoid' for release 'production' . [production]
15:06 <jynus> increase backup2005's allocated disk space [production]
15:01 <jelto@deploy1002> helmfile [codfw] DONE helmfile.d/admin 'apply'. [production]
15:01 <jelto> remove tiller from codfw Kubernetes cluster [production]
15:01 <jelto@deploy1002> helmfile [codfw] START helmfile.d/admin 'apply'. [production]
14:55 <moritzm> drain primary/secondary instances off ganeti2017 T296622 [production]
14:54 <marostegui@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P18110 and previous config saved to /var/cache/conftool/dbconfig/20211210-145401-marostegui.json [production]
14:50 <jelto@deploy1002> helmfile [staging] Ran 'sync' command on namespace 'blubberoid' for release 'staging' . [production]
14:49 <hnowlan@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on restbase2026.codfw.wmnet with reason: New cassandra hosts awaiting syncing [production]
14:49 <hnowlan@cumin1001> START - Cookbook sre.hosts.downtime for 3 days, 0:00:00 on restbase2026.codfw.wmnet with reason: New cassandra hosts awaiting syncing [production]
14:48 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on ganeti2008.codfw.wmnet with reason: Temporarily remove node from Ganeti for reimage [production]
14:48 <jmm@cumin2002> START - Cookbook sre.hosts.downtime for 4 days, 0:00:00 on ganeti2008.codfw.wmnet with reason: Temporarily remove node from Ganeti for reimage [production]
14:48 <jelto@deploy1002> helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'. [production]
14:48 <jelto@deploy1002> helmfile [staging-eqiad] START helmfile.d/admin 'apply'. [production]
14:48 <jelto> remove tiller from staging-eqiad Kubernetes cluster [production]
14:38 <marostegui@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1174 (T277354)', diff saved to https://phabricator.wikimedia.org/P18109 and previous config saved to /var/cache/conftool/dbconfig/20211210-143856-marostegui.json [production]
14:36 <marostegui@cumin1001> dbctl commit (dc=all): 'Depooling db1174 (T277354)', diff saved to https://phabricator.wikimedia.org/P18108 and previous config saved to /var/cache/conftool/dbconfig/20211210-143636-marostegui.json [production]