production SAL

2017-06-21
11:23	<akosiaris>	reboot ganeti1007 for insertion into ganeti cluster	[production]
11:14	<elukey>	reboot aqs1006 for kernel update	[production]
11:04	<moritzm>	rebooting mw1180-mw1188 for kernel update	[production]
11:02	<akosiaris>	starting up all instances on ganeti01.svc.codfw.wmnet	[production]
11:01	<godog>	reimage ms-be1018 / 1019 with stretch	[production]
10:58	<ema>	reboot lvs[2004-2006] (codfw secondaries) for kernel update	[production]
10:50	<akosiaris>	rebooting all ganeti200X nodes	[production]
10:47	<akosiaris>	shutdown all VMs on the ganeti01.svc.codfw.wmnet cluster	[production]
10:43	<elukey>	reboot analytics1001 (Hadoop master) for kernel update	[production]
10:35	<akosiaris>	rebooting the entire codfw ganeti cluster for kernel upgrades. Silenced hosts in icinga already. T167643	[production]
10:30	<moritzm>	rebooting bast4001 for kernel update	[production]
10:21	<ema>	reboot lvs[1001-1003] (eqiad primaries) for kernel update	[production]
10:17	<elukey>	running a script called "check" in tmux on rdb[12]003 to periodically dump LLEN enwiki:jobqueue:enqueue:l-unclaimed; stopped the one on rdb2004	[production]
10:07	<ema>	reboot lvs[1004-1006] (eqiad secondaries) for kernel update	[production]
10:01	<elukey>	reboot analytics1002 (Hadoop master standby) for kernel update	[production]
10:01	<moritzm>	rebooting auth* servers for kernel update	[production]
09:48	<ema>	reboot lvs[1010-1012] for kernel update	[production]
09:48	<elukey>	reboot aqs1005 for kernel update	[production]
09:10	<elukey>	reboot kafka2001 for kernel update (eventbus codfw)	[production]
09:06	<moritzm>	rebooting restbase1017 for kernel update	[production]
08:52	<oblivian@puppetmaster1001>	conftool action : set/pooled=inactive; selector: name=restbase2001.codfw.wmnet,dc=codfw,service=restbase	[production]
08:49	<_joe_>	correction: restarting pybal	[production]
08:49	<_joe_>	restarting etcd on lvs2003/2006, connection lost to etcd	[production]
08:34	<elukey>	reboot kafka1012 for kernel upgrades	[production]
08:34	<marostegui>	Deploy alter table db1070 s5 - T166207	[production]
08:33	<marostegui@tin>	Synchronized wmf-config/db-eqiad.php: Depool db1070 - T166207 (duration: 00m 44s)	[production]
08:27	<marostegui@tin>	Synchronized wmf-config/db-eqiad.php: Repool db1082 - T166207 (duration: 00m 45s)	[production]
08:26	<godog>	reimage ms-be1014 / 1015 with jessie	[production]
07:37	<marostegui>	Stop and reset slave s5 on dbstore2001 - T168354	[production]
06:23	<mutante>	planet2001 wget missing unpuppetized logo file from https://en.planet.wikimedia.org/images/planet-wm2.png - should fix puppet run	[production]
06:19	<marostegui>	Stop replication and puppet on db2066 for maintenance - T168354	[production]
06:18	<marostegui@tin>	Synchronized wmf-config/db-codfw.php: Depool db2066 - T168354 (duration: 00m 43s)	[production]
06:08	<elukey>	reboot thorium for kernel upgrades (outage to all the analytics websites)	[production]
06:05	<marostegui>	Deploy alter table s5 - db1082 - T166207	[production]
06:04	<marostegui@tin>	Synchronized wmf-config/db-eqiad.php: Depool db1082 - T166207 (duration: 00m 44s)	[production]
06:04	<marostegui>	Deploy alter table s5 - dbstore1002 - T166207	[production]
05:59	<elukey>	reboot stat100[2,3,4] for kernel upgrades	[production]
05:57	<marostegui@tin>	Synchronized wmf-config/db-eqiad.php: Repool db1087 - T166207 (duration: 00m 44s)	[production]
05:54	<marostegui>	Deploy alter table s5 - labsdb1011 - T166207	[production]
05:50	<marostegui@tin>	Synchronized wmf-config/db-eqiad.php: Repool db1021 - T166205 (duration: 01m 00s)	[production]
05:41	<marostegui>	Start relearn BBU cycle on db1016 - T166344	[production]
03:13	<mutante>	planet - copying HTML files from docroot from planet1001 to planet2001 - (don't serve Debian default page)	[production]
03:03	<mutante>	planet1001 - remove/purge all php5* packages	[production]
02:57	<l10nupdate@tin>	ResourceLoader cache refresh completed at Wed Jun 21 02:57:19 UTC 2017 (duration 6m 41s)	[production]
02:50	<l10nupdate@tin>	scap sync-l10n completed (1.30.0-wmf.6) (duration: 06m 06s)	[production]
02:26	<l10nupdate@tin>	scap sync-l10n completed (1.30.0-wmf.5) (duration: 06m 52s)	[production]
01:45	<mutante>	planet1001 - remove php5 package	[production]
00:34	<mutante>	planet2001 - revoke old puppet cert, salt-key, re-add new cert/key after reinstall	[production]
00:24	<mutante>	planet2001 - scheduled downtime, reinstall with stretch	[production]
00:06	<mutante>	tin (deployment): manually remove l10nupdate cron, let puppet re-create it after gerrit:350749. stops l10nupdate cron from running on weekends. naos didn't need an action. (T164035).	[production]