production SAL

2020-08-06
06:37	<elukey>	roll restart of druid clusters' zookeeper and an-conf* zookeeper for openjdk-11 upgrades	[production]
06:36	<elukey@cumin1001>	START - Cookbook sre.zookeeper.roll-restart-zookeeper	[production]
06:22	<marostegui@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)	[production]
06:20	<marostegui@cumin1001>	START - Cookbook sre.hosts.downtime	[production]
05:07	<marostegui@cumin1001>	dbctl commit (dc=all): 'Depool db1127 for MCR', diff saved to https://phabricator.wikimedia.org/P12184 and previous config saved to /var/cache/conftool/dbconfig/20200806-050743-marostegui.json	[production]
04:56	<marostegui@cumin1001>	dbctl commit (dc=all): 'Fully repool db1079', diff saved to https://phabricator.wikimedia.org/P12182 and previous config saved to /var/cache/conftool/dbconfig/20200806-045622-marostegui.json	[production]
04:51	<marostegui@cumin1001>	dbctl commit (dc=all): 'Slowly repool db1079', diff saved to https://phabricator.wikimedia.org/P12181 and previous config saved to /var/cache/conftool/dbconfig/20200806-045107-marostegui.json	[production]
04:46	<marostegui@cumin1001>	dbctl commit (dc=all): 'Slowly repool db1079', diff saved to https://phabricator.wikimedia.org/P12180 and previous config saved to /var/cache/conftool/dbconfig/20200806-044608-marostegui.json	[production]
04:37	<marostegui@cumin1001>	dbctl commit (dc=all): 'Slowly repool db1079', diff saved to https://phabricator.wikimedia.org/P12179 and previous config saved to /var/cache/conftool/dbconfig/20200806-043758-marostegui.json	[production]
03:04	<dzahn@cumin1001>	conftool action : set/pooled=yes; selector: name=wtp2019.codfw.wmnet	[production]
02:24	<eileen>	process-control config revision is 525eb71235 turn off delete deleted contacts	[production]
01:52	<dzahn@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)	[production]
01:52	<dzahn@cumin1001>	START - Cookbook sre.hosts.downtime	[production]
01:19	<dzahn@cumin1001>	END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)	[production]
01:19	<dzahn@cumin1001>	START - Cookbook sre.hosts.downtime	[production]
01:17	<dzahn@cumin1001>	END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)	[production]
01:17	<dzahn@cumin1001>	START - Cookbook sre.hosts.downtime	[production]
00:35	<mutante>	wtp2019 - reimaging - parsoid service does not work, unlike on all other wtp*, making sure it's clean	[production]
00:00	<mutante>	LDAP - removed demon from nda group	[production]
2020-08-05
23:57	<eileen>	civicrm revision changed from 150c3476c4 to 72452e28a9, config revision is b6ece03513	[production]
23:02	<shdubsh>	logstash in codfw looks stuck -- restarting	[production]
19:41	<brennen@deploy1001>	rebuilt and synchronized wikiversions files: Revert group1 wikis to 1.36.0-wmf.2	[production]
19:39	<pt1979@cumin2001>	END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)	[production]
19:37	<pt1979@cumin2001>	START - Cookbook sre.hosts.downtime	[production]
19:13	<brennen@deploy1001>	Synchronized php: group1 wikis to 1.36.0-wmf.3 (duration: 01m 44s)	[production]
19:11	<brennen@deploy1001>	rebuilt and synchronized wikiversions files: group1 wikis to 1.36.0-wmf.3	[production]
18:26	<Lucas_WMDE>	Morning backport window done	[production]
18:25	<lucaswerkmeister-wmde@deploy1001>	Synchronized php-1.36.0-wmf.3/extensions/ContentTranslation/: Backport: [[gerrit:618566|Pass jQuery objects into jqueryMsg]] (duration: 01m 11s)	[production]
18:14	<mutante>	test !log	[production]
18:10	<lucaswerkmeister-wmde@deploy1001>	Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:618343|Re-enable growth study quick survey (T257015)]] (duration: 01m 12s)	[production]
17:30	<shdubsh>	test prometheus-icinga-exporter upgrade on icinga2001	[production]
16:50	<elukey>	powercycle stat1005 after GPU issue	[production]
15:56	<otto@deploy1001>	Synchronized wmf-config/InitialiseSettings.php: EventStreamConfig - Add eventgate-logging-external streams and destination_event_service settings - T251935 (duration: 01m 05s)	[production]
15:50	<hnowlan@deploy1001>	helmfile [staging] Ran 'sync' command on namespace 'api-gateway' for release 'staging' .	[production]
15:43	<hnowlan@deploy1001>	helmfile [staging] Ran 'sync' command on namespace 'api-gateway' for release 'staging' .	[production]
15:11	<pt1979@cumin2001>	END (PASS) - Cookbook sre.dns.netbox (exit_code=0)	[production]
15:08	<godog>	bounce logstash on logstash100[789] - udp loss reported	[production]
15:05	<pt1979@cumin2001>	START - Cookbook sre.dns.netbox	[production]
14:48	<elukey>	reboot stat1008 for unexpected maintenance (GPU stuck)	[production]
14:33	<otto@deploy1001>	helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-main' for release 'canary' .	[production]
14:32	<otto@deploy1001>	helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-main' for release 'production' .	[production]
14:27	<otto@deploy1001>	helmfile [codfw] Ran 'sync' command on namespace 'eventgate-main' for release 'canary' .	[production]
14:27	<otto@deploy1001>	helmfile [codfw] Ran 'sync' command on namespace 'eventgate-main' for release 'production' .	[production]
14:25	<moritzm>	installing nmap bugfix updates from buster point release	[production]
14:24	<otto@deploy1001>	helmfile [staging] Ran 'sync' command on namespace 'eventgate-main' for release 'production' .	[production]
14:24	<otto@deploy1001>	helmfile [staging] Ran 'sync' command on namespace 'eventgate-main' for release 'canary' .	[production]
14:20	<sukhe@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)	[production]
14:20	<sukhe@cumin1001>	START - Cookbook sre.hosts.downtime	[production]
14:14	<moritzm>	installing pillow security updates	[production]
14:03	<moritzm>	installing node-minimist security updates	[production]