production SAL

2016-02-08
14:41	<bblack>	mobile LVS service decom complete (IPs now belong to text service)	[production]
14:03	<bblack>	starting mobile LVS service decom (IPs moving to text) - puppet disabled on text caches and high-traffic1 LVSes	[production]
13:56	<bblack>	cpNNNN rolling reboots paused (3038 still coming up)	[production]
13:12	<bblack>	start up more rolling cache reboots for kernels (cpNNNN)	[production]
13:09	<elukey>	updated hhvm on mw2016.codfw.wmnet, mw2161.codfw.wmnet, mw2199.codfw.wmnet, mw1259.eqiad.wmnet, mw1260.eqiad.wmnet	[production]
13:05	<_joe_>	roll back installation of pybal, issues with udp and ipv6	[production]
12:56	<elukey>	updated hhvm on mw1080, mw1084, mw1241	[production]
12:32	<elukey>	restarting hhvm on mw1052, mw1075, mw1080, mw1081, mw1094, mw1095 to rollout the new version	[production]
12:32	<_joe_>	uploaded a new pybal package; installing on codfw and ulsfo backups	[production]
12:05	<_joe_>	restarted cron on tin, to catch up with the uid change for the l10nupdate user	[production]
11:53	<bblack>	rebooting cp1074, cp3047 (for kernels, also to compare bios/drac settings...)	[production]
11:26	<jynus>	stopping mysql at db2012	[production]
11:25	<jynus>	starting mysql at db2012	[production]
11:05	<moritzm>	rebooting db2012 for kernel update	[production]
11:00	<moritzm>	rebooting terbium for kernel update	[production]
10:26	<moritzm>	rebooting es2006,es2008 for kernel update	[production]
10:25	<moritzm>	upgrading jobrunners/imagescalers in eqiad for hhvm float timeout fix	[production]
10:20	<jynus>	changing s2 replication topology in preparation for master failover	[production]
09:45	<jynus>	starting es2004	[production]
09:29	<moritzm>	rebooting es2005,es2007,es2009,es2010 for kernel update	[production]
09:15	<elukey>	hhvm restarted on mw1044.eqiad.wmnet due to hhvm package update	[production]
09:15	<l10nupdate@tin>	ResourceLoader cache refresh completed at Mon Feb 8 09:15:11 UTC 2016 (duration 8m 10s)	[production]
09:12	<elukey>	hhvm restarted on mw1034.eqiad.wmnet due to hhvm package update	[production]
09:07	<oblivian@tin>	sync-l10n completed (1.27.0-wmf.12) (duration: 11m 55s)	[production]
08:42	<_joe_>	trying a manual run of l10nupdate since it failed last night again	[production]
08:25	<moritzm>	rebooting es2001 to es2004 for kernel update	[production]
2016-02-07
04:53	<andrewbogott>	upgraded python-openstackclient python-glanceclient python-novaclient python-keystoneclient on silver	[production]
2016-02-06
05:43	<bblack>	rebooted cp2006 via racadm after crash - no crash data in logs...	[production]
2016-02-05
23:54	<chasemp>	nfs shaping is really writes :)	[production]
23:54	<chasemp>	tc to shape some nfs read traffic in tools for labs (also logged there) can be cancelled with: /sbin/tc qdisc del dev eth0 root	[production]
23:51	<YuviPanda>	dropped old nfs snapshots from labstore1001	[production]
23:30	<maxsem@mira>	Synchronized portals: (no message) (duration: 01m 18s)	[production]
23:29	<maxsem@mira>	Synchronized portals/prod/wikipedia.org/assets: (no message) (duration: 01m 19s)	[production]
22:56	<jynus>	reimaging db1018	[production]
22:48	<jynus>	restarting slave on m2/codfw (db2011)	[production]
22:41	<krenair@mira>	Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/268818/ (duration: 01m 22s)	[production]
22:10	<bblack>	cache rolling reboots stopped for the weekend, can pick up the other half monday	[production]
20:36	<bblack>	resuming rolling cache reboots	[production]
20:07	<mutante>	cygnus - reboot VM	[production]
19:28	<bblack>	halted rolling cache reboots, we seem to be having problems with a batch of them coming back...	[production]
18:23	<demon@mira>	Synchronized wmf-config/InitialiseSettings.php: comment stuff, gerrit 267994 (duration: 01m 19s)	[production]
18:15	<jynus>	stopping mysql@db1018 and starting to clone it for reimaging	[production]
18:10	<jynus@mira>	Synchronized wmf-config/db-eqiad.php: Depool db1018 for maintenance (duration: 02m 12s)	[production]
17:31	<cmjohnson1>	troubleshooting elastic1021	[production]
17:07	<bblack>	rolling cpNNNN reboots are 27% complete, only two hosts so far failed to reboot on their own (but came up fine after manual racadm powercycle)	[production]
16:20	<ottomata>	reenabling kafka1012 in analytics-eqiad kafka cluster	[production]
16:03	<jynus>	reimaging db2030 to test jessie installer	[production]
15:53	<oblivian@tin>	sync-l10n completed (1.27.0-wmf.12) (duration: 00m 08s)	[production]
15:47	<urandom>	performing rolling restbase restart in staging env	[production]
15:38	<_joe_>	launched l10nupdate cronjob manually, was not running since tin's reimaging	[production]