production SAL

2019-08-13
15:42	<gehel@cumin2001>	END (FAIL) - Cookbook sre.elasticsearch.rolling-reboot (exit_code=99)	[production]
15:39	<bblack>	puppet re-enabled on lvs1014, lvs1016, icinga1001	[production]
15:35	<XioNoX>	depool eqsin for cr2-eqsin upgrade	[production]
15:32	<bblack>	disabled puppet on lvs1014, lvs1016, icinga1001 ahead of deploying https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/528885/ - T229621	[production]
15:32	<gehel@cumin2001>	START - Cookbook sre.elasticsearch.rolling-reboot	[production]
15:30	<XioNoX>	rollback ospf + bgp changes on cr2-eqord	[production]
15:19	<XioNoX>	restart cr2-eqord - T227886	[production]
15:12	<XioNoX>	disable all peering and transit on cr2-eqord	[production]
15:01	<XioNoX>	increase ospf cost of cr2-eqord<->cr2-eqiad link (+1000)	[production]
14:57	<ema>	cp5002: reboot for kernel upgrade	[production]
14:42	<gehel@cumin2001>	END (FAIL) - Cookbook sre.elasticsearch.rolling-reboot (exit_code=99)	[production]
14:42	<gehel@cumin2001>	START - Cookbook sre.elasticsearch.rolling-reboot	[production]
14:31	<gehel@cumin2001>	END (FAIL) - Cookbook sre.elasticsearch.rolling-reboot (exit_code=99)	[production]
14:31	<gehel@cumin2001>	START - Cookbook sre.elasticsearch.rolling-reboot	[production]
14:29	<XioNoX>	rollback: disable all peering and transit on cr2-eqdfw	[production]
14:18	<XioNoX>	reboot cr2-eqdfw for software upgrade - T227886	[production]
14:14	<XioNoX>	disable all peering and transit on cr2-eqdfw	[production]
14:04	<volans@cumin2001>	END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)	[production]
14:04	<volans@cumin2001>	START - Cookbook sre.hosts.decommission	[production]
13:20	<jbond42>	rolling update of postgresql-9.6	[production]
13:07	<jijiki>	rolling restart hhvm on api servers in eqiad	[production]
12:57	<jijiki>	Restart hhvm on mw1235	[production]
12:17	<fsero@puppetmaster1001>	conftool action : set/pooled=true; selector: dnsdisc=sessionstore|citoid|cxserver|eventgate-analytics|eventgate-main|termbox|blubberoid|mathoid|zotero,name=eqiad	[production]
12:08	<_joe_>	restarted php-fpm on mw1221	[production]
12:03	<fsero@>	helmfile [EQIAD] Ran 'apply' command on namespace 'sessionstore' for release 'production' .	[production]
12:00	<fsero@>	helmfile [EQIAD] Ran 'apply' command on namespace 'cxserver' for release 'production' .	[production]
11:56	<fsero@>	helmfile [EQIAD] Ran 'apply' command on namespace 'blubberoid' for release 'production' .	[production]
11:56	<fsero@>	helmfile [EQIAD] Ran 'apply' command on namespace 'blubberoid' for release 'production' .	[production]
11:49	<fsero@>	helmfile [EQIAD] Ran 'apply' command on namespace 'blubberoid' for release 'production' .	[production]
11:44	<fsero>	recreating cxserver blubber and sessionstore namespace - T228836	[production]
11:39	<fsero@>	helmfile [EQIAD] Ran 'apply' command on namespace 'mathoid' for release 'production' .	[production]
11:35	<gehel>	restart wdqs-blazegraph on wdqs2001	[production]
11:34	<gehel>	restart wdqs-updater on wdqs2001	[production]
11:30	<fsero@>	helmfile [EQIAD] Ran 'apply' command on namespace 'eventgate-main' for release 'main' .	[production]
11:29	<fsero@>	helmfile [EQIAD] Ran 'apply' command on namespace 'eventgate-analytics' for release 'analytics' .	[production]
11:25	<fsero@>	helmfile [EQIAD] Ran 'apply' command on namespace 'citoid' for release 'production' .	[production]
11:21	<fsero>	recreating citoid eventgate-analytics eventgate-main mathoid namespace - T228836	[production]
11:20	<fsero@>	helmfile [EQIAD] Ran 'apply' command on namespace 'termbox' for release 'production' .	[production]
11:18	<raynor>	EU SWAT finished	[production]
11:15	<pmiazga@deploy1001>	Synchronized wmf-config/InitialiseSettings.php: SWAT: [[gerrit:529925|Undeploy editor gender surveys (T227793)]] (duration: 00m 48s)	[production]
11:13	<fsero>	recreating termbox namespace - T228836	[production]
11:06	<oblivian@>	helmfile [EQIAD] Ran 'apply' command on namespace 'zotero' for release 'production' .	[production]
11:04	<fsero>	resetting net.netfilter.nf_conntrack_tcp_timeout_time_wait to 65 in kubernetes2006	[production]
10:59	<_joe_>	[eqiad] downtiming zotero on icinga for 10 minutes while recreating the deployment with helmfile	[production]
10:57	<oblivian@cumin1001>	END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)	[production]
10:57	<oblivian@cumin1001>	START - Cookbook sre.hosts.downtime	[production]
10:56	<oblivian@cumin1001>	END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)	[production]
10:56	<oblivian@cumin1001>	START - Cookbook sre.hosts.downtime	[production]
10:49	<oblivian@>	helmfile [EQIAD] Ran 'apply' command on namespace 'kube-system' for release 'coredns' .	[production]
10:44	<oblivian@>	helmfile [EQIAD] Ran 'apply' command on namespace 'kube-system' for release 'coredns' .	[production]