production SAL

2021-09-24 §
07:17	<elukey@cumin1001>	END (PASS) - Cookbook sre.hadoop.roll-restart-workers (exit_code=0) restart workers for Hadoop test cluster: Roll restart of jvm daemons for openjdk upgrade. - elukey@cumin1001	[production]
07:01	<elukey@cumin1001>	START - Cookbook sre.hadoop.roll-restart-workers restart workers for Hadoop test cluster: Roll restart of jvm daemons for openjdk upgrade. - elukey@cumin1001	[production]
07:01	<elukey@cumin1001>	END (ERROR) - Cookbook sre.hadoop.roll-restart-workers (exit_code=97) restart workers for Hadoop test cluster: Roll restart of jvm daemons for openjdk upgrade. - elukey@cumin1001	[production]
07:00	<elukey@cumin1001>	START - Cookbook sre.hadoop.roll-restart-workers restart workers for Hadoop test cluster: Roll restart of jvm daemons for openjdk upgrade. - elukey@cumin1001	[production]
06:55	<elukey@cumin1001>	START - Cookbook sre.kafka.roll-restart-brokers for Kafka A:kafka-test-eqiad cluster: Roll restart of jvm daemons for openjdk upgrade. - elukey@cumin1001	[production]
06:53	<elukey@cumin1001>	END (PASS) - Cookbook sre.druid.roll-restart-workers (exit_code=0) for Druid test cluster: Roll restart of Druid jvm daemons. - elukey@cumin1001	[production]
06:44	<elukey@cumin1001>	START - Cookbook sre.druid.roll-restart-workers for Druid test cluster: Roll restart of Druid jvm daemons. - elukey@cumin1001	[production]
06:41	<elukey@cumin1001>	END (PASS) - Cookbook sre.presto.roll-restart-workers (exit_code=0) for Presto analytics cluster: Roll restart of all Presto's jvm daemons. - elukey@cumin1001	[production]
06:30	<elukey@cumin1001>	START - Cookbook sre.presto.roll-restart-workers for Presto analytics cluster: Roll restart of all Presto's jvm daemons. - elukey@cumin1001	[production]
06:26	<elukey>	restart archiva on archiva1002 to pick up new openjdk upgrades	[production]
2021-09-23 §
16:13	<elukey>	reboot an-worker1096 to see if megacli status for a new disk changes - T290805	[production]
15:09	<elukey@deploy1002>	helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.	[production]
15:09	<elukey@deploy1002>	helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.	[production]
15:06	<elukey@deploy1002>	helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.	[production]
15:06	<elukey@deploy1002>	helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.	[production]
14:19	<elukey@deploy1002>	helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.	[production]
14:19	<elukey@deploy1002>	helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.	[production]
13:09	<elukey>	update pcc facts (after change in puppetdb's fact filter list, to allow partitions for analytics)	[production]
07:01	<elukey@deploy1002>	helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.	[production]
07:01	<elukey@deploy1002>	helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.	[production]
06:59	<elukey@deploy1002>	helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.	[production]
06:59	<elukey@deploy1002>	helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.	[production]
06:57	<elukey@deploy1002>	helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.	[production]
06:57	<elukey@deploy1002>	helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.	[production]
06:55	<elukey@deploy1002>	helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.	[production]
06:55	<elukey@deploy1002>	helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.	[production]
06:55	<elukey@deploy1002>	helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.	[production]
06:55	<elukey@deploy1002>	helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.	[production]
2021-09-22 §
06:02	<elukey>	update pcc facts	[production]
2021-09-21 §
17:39	<elukey>	update pcc facts	[production]
15:39	<elukey>	update pcc facts	[production]
11:55	<elukey@cumin1001>	END (PASS) - Cookbook sre.dns.netbox (exit_code=0)	[production]
11:46	<elukey@cumin1001>	START - Cookbook sre.dns.netbox	[production]
2021-09-20 §
13:39	<elukey@cumin1001>	END (PASS) - Cookbook sre.ores.roll-restart-workers (exit_code=0) for ORES codfw cluster: Roll restart of ORES's daemons. - elukey@cumin1001	[production]
13:20	<elukey@cumin1001>	START - Cookbook sre.ores.roll-restart-workers for ORES codfw cluster: Roll restart of ORES's daemons. - elukey@cumin1001	[production]
2021-09-16 §
07:48	<elukey@puppetmaster1001>	conftool action : set/pooled=true; selector: dnsdisc=helm-charts,name=eqiad	[production]
2021-09-15 §
06:57	<elukey>	shutdown ms-be2045 (again) after seeing T290881	[production]
06:02	<elukey>	powercycle ms-be2045 - no ssh, no remote tty available	[production]
2021-09-13 §
09:18	<elukey>	upgrade rsyslog* on ml-serve* nodes to 8.1901.0-1+wmf2	[production]
09:11	<elukey>	upload rsyslog* 8.1901.0-1+wmf2 to buster-wikimedia component/rsyslog-k8s - T277739	[production]
2021-09-10 §
08:14	<elukey@deploy1002>	helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.	[production]
08:14	<elukey@deploy1002>	helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.	[production]
08:14	<elukey@deploy1002>	helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.	[production]
08:13	<elukey@deploy1002>	helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.	[production]
08:12	<elukey@deploy1002>	helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.	[production]
08:12	<elukey@deploy1002>	helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.	[production]
07:31	<elukey@deploy1002>	helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.	[production]
07:31	<elukey@deploy1002>	helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.	[production]
06:02	<elukey@puppetmaster1001>	conftool action : set/pooled=inactive; selector: name=mw2280.codfw.wmnet	[production]
05:56	<elukey>	powercycle mw2280 - no tty available in mgmt, no ssh, host frozen	[production]