production SAL

2018-04-16 §
08:41	<moritzm>	pooled mw1261-mw1264 (app server canaries running stretch)	[production]
08:29	<joal@tin>	Finished deploy [analytics/refinery@27416a9]: Regular weekly deploy - Mostly bugfixes from previous week huge deploy (duration: 05m 27s)	[production]
08:25	<_joe_>	depooling mw1223 for investigation too	[production]
08:23	<joal@tin>	Started deploy [analytics/refinery@27416a9]: Regular weekly deploy - Mostly bugfixes from previous week huge deploy	[production]
08:17	<marostegui@tin>	Synchronized wmf-config/db-eqiad.php: Slowly repool db1114 in API (duration: 00m 58s)	[production]
08:04	<elukey>	restart hhvm on mw[1228,1234,1281-1287,1289,1290,1312-1314,1317,1339,1343,1345,1346,1348] - more than 50% cpu usage, prevention scheme for current high load	[production]
08:00	<marostegui@tin>	Synchronized wmf-config/db-eqiad.php: Slowly repool db1114 (duration: 00m 58s)	[production]
07:49	<marostegui>	Stop MySQL and reboot db1114 - T191996	[production]
07:46	<marostegui@tin>	Synchronized wmf-config/db-eqiad.php: Depool db1114 (duration: 00m 59s)	[production]
07:40	<vgutierrez@neodymium>	conftool action : set/pooled=no; selector: name=achernar.wikimedia.org,service=pdns_recursor	[production]
07:39	<vgutierrez>	Depool and reimage achernar.wikimedia.org - T187090	[production]
07:27	<moritzm>	installing perl security updates on Debian systems	[production]
06:45	<TimStarling>	depooled mw1230	[production]
06:38	<_joe_>	repooling mw1230	[production]
06:20	<marostegui>	Drop table flow_subscription from x1 - T149936	[production]
05:59	<elukey>	restart hhvm on mw[1221,1233,1280,1347] - high load	[production]
05:55	<elukey>	repool mw1341 after investigation	[production]
05:48	<elukey>	restart hhvm on mw1225, 1315, 1316, 1340, 1341, 1342, 1347 - high load	[production]
05:42	<marostegui>	Reload haproxy on dbproxy1010	[production]
05:36	<elukey>	restart hhvm on mw1226,27,32,88 - high load	[production]
05:35	<_joe_>	depooling mw1341 to further debug the API issue	[production]
05:33	<marostegui>	Deploy schema change on db1087 with replication (this will generate lag in labs) - T187089 T185128 T153182	[production]
05:31	<marostegui@tin>	Synchronized wmf-config/db-eqiad.php: Depool db1087 (duration: 00m 59s)	[production]
03:02	<l10nupdate@tin>	scap sync-l10n completed (1.31.0-wmf.29) (duration: 11m 09s)	[production]
2018-04-15 §
22:09	<ema>	cp3037: restart varnish-be	[production]
21:45	<ema>	cp3039: restart varnish-be	[production]
21:42	<elukey>	restart hhvm on mw1286,1317,1339 - high load	[production]
21:31	<ema>	cp3038: restart varnish-be	[production]
21:30	<ema>	cp3036: restart varnish-be	[production]
20:52	<elukey>	restart hhvm on mw13[43,45,46,48] - high load	[production]
20:48	<elukey>	restart hhvm on mw13[12-14] - high load	[production]
20:45	<elukey>	restart hhvm on mw[1285,1287,1289-1290] - high load	[production]
20:40	<_joe_>	restart mw1344, high load	[production]
20:38	<elukey>	restart hhvm on mw12[22,79,82] - high load	[production]
20:32	<elukey>	restart hhvm on mw12[32-35] - high load	[production]
20:24	<elukey>	restart hhvm on mw1229-31 - high load	[production]
20:24	<_joe_>	restarted mw1280-4, high load	[production]
20:17	<elukey>	restart hhvm on mw122[6-8] - high load	[production]
20:05	<elukey>	restart hhvm on mw122[3,4] - high load	[production]
13:42	<elukey>	restart hhvm on mw1227 due to high load (hhvm dump debug in /tmp/hhvm.44071.bt)	[production]
10:53	<elukey>	powercycle mw1272 - not responsive to ssh, mgmt com2 console showing "[OK" and no tty	[production]
2018-04-13 §
20:44	<imarlier@tin>	Finished deploy [performance/navtiming@8b6ab4e]: initial attempt to deploy navtiming via scap (will not be active) (duration: 00m 02s)	[production]
20:44	<imarlier@tin>	Started deploy [performance/navtiming@8b6ab4e]: initial attempt to deploy navtiming via scap (will not be active)	[production]
20:00	<demon@tin>	Pruned MediaWiki: 1.31.0-wmf.28 [keeping static files] (duration: 01m 34s)	[production]
19:23	<demon@tin>	Pruned MediaWiki: 1.31.0-wmf.25 (duration: 05m 03s)	[production]
17:17	<andrewbogott>	upgraded packages on all labvirts and restarted nova-compute	[production]
16:55	<arturo>	enable puppet in labstore1005	[production]
16:42	<marostegui@tin>	Synchronized wmf-config/db-eqiad.php: Give db1104 original main traffic weight (duration: 01m 00s)	[production]
16:34	<andrewbogott>	upgrading packages on labvirt1016 and rebooting (1016 is a spare server that won't affect VPS users)	[production]
16:26	<arturo>	disable puppet in labstore1005 to hot-test https://gerrit.wikimedia.org/r/#/c/426103/	[production]