production SAL

2019-10-25 §
01:28	<bblack@cumin1001>	conftool action : set/pooled=no; selector: name=cp3041.esams.wmnet	[production]
01:27	<bblack@cumin1001>	conftool action : set/pooled=yes; selector: name=cp3061.esams.wmnet	[production]
01:27	<bblack@cumin1001>	conftool action : set/pooled=no; selector: name=cp3046.esams.wmnet	[production]
01:27	<bblack@cumin1001>	conftool action : set/pooled=no; selector: name=cp3045.esams.wmnet	[production]
01:13	<mutante>	puppetmaster1001 - revoking parsoid.svc.eqiad / parsoid.svc.codfw / parsoid.discovery.wmnet certificates and creating new ones including parsoid-php.discovery.wmnet (T233654)	[production]
00:52	<krinkle@deploy1001>	Synchronized php-1.35.0-wmf.3/extensions/LiquidThreads/classes/View.php: (no justification provided) (duration: 00m 54s)	[production]
2019-10-24 §
23:46	<mutante>	bast3002 - rsyncing /home, /srv/tftpboot and /srv/prometheus to /srv/bast3002/ on bast3004 (T236394 T236329)	[production]
23:24	<krinkle@deploy1001>	Synchronized php-1.35.0-wmf.3/includes/specials/pagers/BlockListPager.php: T236425, fc99c5a7c0de2 (duration: 00m 54s)	[production]
22:16	<bblack@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)	[production]
22:14	<bblack@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)	[production]
22:13	<mutante>	gerrit1001 - starting gerrit	[production]
22:13	<bblack@cumin1001>	START - Cookbook sre.hosts.downtime	[production]
22:12	<bblack@cumin1001>	END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)	[production]
22:12	<bblack@cumin1001>	START - Cookbook sre.hosts.downtime	[production]
22:12	<bblack@cumin1001>	START - Cookbook sre.hosts.downtime	[production]
22:11	<bblack@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)	[production]
22:10	<thcipriani>	stopping gerrit briefly for script run for T236344	[production]
22:09	<bblack@cumin1001>	START - Cookbook sre.hosts.downtime	[production]
22:01	<mutante>	mw1270 - was alerting in Icinga as degraded systemd state - reason was 'hhvm.service not-found'. systemctl reset-failed cleared it. could cause monitoring spam on more servers (T229792)	[production]
21:56	<eileen>	civicrm revision changed from 47e0800001 to a55c2d2787, config revision is 63a67f32a1	[production]
21:16	<bblack@cumin1001>	conftool action : set/pooled=no; selector: name=cp3040.esams.wmnet	[production]
21:16	<bblack@cumin1001>	conftool action : set/pooled=yes; selector: name=cp3050.esams.wmnet	[production]
21:13	<bblack@cumin1001>	conftool action : set/pooled=yes; selector: name=cp3051.esams.wmnet	[production]
21:13	<bblack@cumin1001>	conftool action : set/pooled=no; selector: name=cp3044.esams.wmnet	[production]
21:12	<bblack@cumin1001>	conftool action : set/pooled=no; selector: name=cp3039.esams.wmnet	[production]
21:06	<bblack>	cr3-esams remove pybal neighbor IPs for lvs3001-4	[production]
21:05	<bblack>	cr2-esams remove pybal neighbor IPs for lvs3001-4	[production]
21:05	<urandom>	restbase cassandra rolling restart, codfw / rack 'd' -- T200803	[production]
21:02	<bblack>	downtimed lvs3001-4, stopping pybal there, etc...	[production]
20:58	<bblack>	cr3-esams switch high-traffic1 static fallback routes from lvs3001 to lvs3005	[production]
20:58	<bblack>	cr2-esams switch high-traffic1 static fallback routes from lvs3001 to lvs3005	[production]
20:40	<bblack>	esams lvs: high-traffic1 - change 3005's med to 0 (becomes new primary, permanently)	[production]
20:36	<bblack>	esams lvs: high-traffic1 - change 3003's med to 200, 3001's med to 50, 3005 remains 100 (traffic will blip to 3005 then back to 3001 again)	[production]
20:33	<urandom>	restbase cassandra rolling restart, codfw / rack 'c' -- T200803	[production]
20:24	<bblack@cumin1001>	conftool action : set/pooled=no; selector: name=cp3038.esams.wmnet	[production]
20:24	<bblack@cumin1001>	conftool action : set/pooled=no; selector: name=cp3033.esams.wmnet	[production]
20:23	<bblack@cumin1001>	conftool action : set/pooled=yes; selector: name=cp3053.esams.wmnet	[production]
20:22	<bblack@cumin1001>	conftool action : set/pooled=yes; selector: name=cp3054.esams.wmnet	[production]
20:04	<bblack>	reboot cp3054 again for good measure	[production]
19:57	<bblack>	cp3054 - trying racadm serveraction hardreset	[production]
19:32	<bblack>	reboot dns3001	[production]
19:31	<urandom>	restbase cassandra rolling restart, codfw / rack 'b' -- T200803	[production]
19:10	<dzahn@cumin1001>	END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)	[production]
19:07	<dzahn@cumin1001>	END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)	[production]
19:05	<urandom>	restbase cassandra rolling restart, rack 'd' -- T200803	[production]
19:05	<dzahn@cumin1001>	END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)	[production]
19:05	<dzahn@cumin1001>	START - Cookbook sre.hosts.downtime	[production]
19:05	<dzahn@cumin1001>	START - Cookbook sre.hosts.downtime	[production]
19:03	<dzahn@cumin1001>	START - Cookbook sre.hosts.downtime	[production]
19:01	<bblack@cumin1001>	END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)	[production]