production SAL

2020-09-09
08:30	<kormat@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)	[production]
08:30	<kormat@cumin1001>	START - Cookbook sre.hosts.downtime	[production]
08:14	<oblivian@deploy1001>	helmfile [staging] Ran 'sync' command on namespace 'mobileapps' for release 'staging' .	[production]
07:41	<urbanecm@deploy1001>	Synchronized wmf-config/InitialiseSettings.php: Disable DynamicPageList on ruwikinews (T262240) (duration: 01m 22s)	[production]
07:25	<elukey>	restart varnishkafka-webrequest on cp5010 and cp5012, delivery reports errors happening since yesterday's network outage	[production]
06:21	<XioNoX>	push new pfw policies - T262297	[production]
01:58	<eileen>	civicrm revision changed from 4e40a59d42 to cc1f7e6d13, config revision is 4845a229dc	[production]
2020-09-08
23:47	<eileen>	civicrm revision is 4e40a59d42, config revision is d26334fa36	[production]
23:25	<eileen>	civicrm revision changed from 5e7352e2c3 to 4e40a59d42, config revision is 3cf0913789	[production]
22:14	<pt1979@cumin2001>	END (PASS) - Cookbook sre.dns.netbox (exit_code=0)	[production]
22:12	<andrew@deploy1001>	Finished deploy [horizon/deploy@7d727eb]: very minor wmf-puppet-dashboard update (duration: 03m 35s)	[production]
22:08	<andrew@deploy1001>	Started deploy [horizon/deploy@7d727eb]: very minor wmf-puppet-dashboard update	[production]
22:02	<pt1979@cumin2001>	START - Cookbook sre.dns.netbox	[production]
21:57	<andrew@deploy1001>	Finished deploy [horizon/deploy@7a3221d]: refreshing to clobber local hacks (duration: 00m 13s)	[production]
21:57	<andrew@deploy1001>	Started deploy [horizon/deploy@7a3221d]: refreshing to clobber local hacks	[production]
19:19	<jhuneidi@deploy1001>	rebuilt and synchronized wikiversions files: group0 wikis to 1.36.0-wmf.8	[production]
19:12	<jhuneidi@deploy1001>	Finished scap: testwikis wikis to 1.36.0-wmf.8 (duration: 71m 45s)	[production]
18:22	<elukey>	rm /srv/prometheus/ops/targets/mjolnir_msearch_eqiad.yaml on prometheus100[3,4] as cleanup after https://gerrit.wikimedia.org/r/621988 - T260305	[production]
18:00	<jhuneidi@deploy1001>	Started scap: testwikis wikis to 1.36.0-wmf.8	[production]
17:58	<ryankemper@cumin1001>	START - Cookbook sre.wdqs.data-reload	[production]
17:57	<ryankemper@cumin1001>	END (ERROR) - Cookbook sre.wdqs.data-reload (exit_code=97)	[production]
17:54	<Amir1>	Deployed patch for T262240	[production]
17:53	<ryankemper@cumin1001>	START - Cookbook sre.wdqs.data-reload	[production]
17:23	<andrewbogott>	rebooting cloudvirt1033	[production]
17:03	<klausman>	attempted to add rock-dkms_3.3-19_all.deb to thirdparty/amd-rocm33 for use on analytics servers with GPUs	[production]
16:35	<otto@deploy1001>	Synchronized wmf-config/InitialiseSettings.php: wgEventStreams: Set canary_events_enabled: true for eventgate test streams and eventlogging_Test - T251609 (duration: 00m 58s)	[production]
16:34	<herron>	increased elk5 logstash JVM heaps to 2g (to help decrease kafka-logging consumer lag)	[production]
16:12	<longma>	1.36.0-wmf.8 was branched at e81e81e91473cc8259c473165863aca8ecea2784 for T257976	[production]
16:03	<akosiaris@deploy1001>	helmfile [staging] Ran 'sync' command on namespace 'kube-system' for release 'calico-policy-controller' .	[production]
16:03	<akosiaris@deploy1001>	helmfile [eqiad] Ran 'sync' command on namespace 'kube-system' for release 'calico-policy-controller' .	[production]
16:02	<akosiaris@deploy1001>	helmfile [codfw] Ran 'sync' command on namespace 'kube-system' for release 'calico-policy-controller' .	[production]
15:34	<jayme@cumin1001>	conftool action : set/pooled=yes; selector: name=kubernetes1004.*	[production]
15:32	<jayme@cumin1001>	conftool action : set/pooled=yes; selector: service=kubesvc,name=kubernetes1013.*	[production]
15:30	<elukey>	roll restart of hadoop master daemons on an-master100[1,2] after the cookbook failed	[production]
15:26	<elukey@cumin1001>	END (FAIL) - Cookbook sre.hadoop.roll-restart-masters (exit_code=99)	[production]
15:20	<_joe_>	restarted celery-ores-worker.service on ores1007	[production]
15:19	<_joe_>	restarted ferm on wdqs1011	[production]
15:18	<elukey@cumin1001>	START - Cookbook sre.hadoop.roll-restart-masters	[production]
15:16	<_joe_>	starting wdqs-updater on wdqs1005	[production]
15:15	<bblack@cumin1001>	conftool action : set/pooled=yes; selector: name=cp1090.eqiad.wmnet	[production]
15:14	<bblack@cumin1001>	conftool action : set/pooled=yes; selector: name=cp108[789].eqiad.wmnet	[production]
15:14	<bblack>	repool cp1087-90 (eqiad row D)	[production]
15:13	<herron>	rolling restart of elk5 logstashes	[production]
15:10	<marostegui>	Start mysql on db1106 after PDU maintenance is done	[production]
15:03	<jayme@cumin1001>	conftool action : set/pooled=inactive; selector: service=kubesvc,name=kubernetes1013.*	[production]
15:03	<jayme@cumin1001>	conftool action : set/pooled=inactive; selector: name=kubernetes1004.*	[production]
15:03	<XioNoX>	request virtual-chassis vc-port set pic-slot 1 member 4 port 0	[production]
15:03	<XioNoX>	request virtual-chassis vc-port set pic-slot 0 member 2 port 50	[production]
15:02	<XioNoX>	request virtual-chassis vc-port set pic-slot 1 member 1 port 1	[production]
14:53	<marostegui>	Reload dbproxy1016 to recover the alert	[production]