production SAL

2016-10-21
23:35	<yurik>	maps1002.eqiad is running older/incorrect/misbehaving software for some reason, restart didn't help. Need to depool	[production]
22:17	<mutante>	cp4006,cp4014 gzipped some logs in home for disk space	[production]
22:08	<mutante>	cp4006, cp4014 were running out of disk, apt-get clean	[production]
21:40	<mutante>	phab2001 that IP was also on iridium/phab1001, it should not be hardcoded in puppet, causing issues in T143363	[production]
21:37	<mutante>	phab2001 - ip addr del 10.64.32.186/21 dev eth0	[production]
21:06	<bblack>	restarting varnish backends (depooled, etc) for eqiad cache_upload: cp1049, cp1072, cp1074	[production]
19:50	<cmjohnson1>	dataset1001 array 1 swap failed disk slot 4	[production]
19:40	<cmjohnson1>	labvirt1005 swapping disk 0	[production]
19:40	<gehel>	routing traffic for cache-maps in codfw -> eqiad	[production]
19:29	<gehel>	running puppet on eqiad cache nodes to activate maps traffic redirection	[production]
19:06	<gehel>	shutting down cassandra on maps2004, seems to have lost data	[production]
18:22	<ejegg>	updated SmashPig from d1ca0632d00dfb608f70ca4b70251a5ba49f4411 to e28b2cd9f0c1429acdd2a08c68f95884dbffb594	[production]
16:45	<ejegg>	updated fundraising tools from 09ae6e24d8ca8350dc099d63a6ca0d9ec9fdef2b to f83e39291adc55677fc4b49307dc4807eba18019	[production]
16:33	<mutante>	rebooting planet1001 - *.planet.wm.org will be right back	[production]
16:30	<mutante>	rebooting planet2001	[production]
16:05	<elukey>	reimaging mc1021 with wmf-auto-reimage (T137345)	[production]
15:28	<elukey>	reimaging mc1019 with wmf-auto-reimage (T137345)	[production]
14:50	<elukey>	reimaging mc1020 with wmf-auto-reimage (T137345)	[production]
14:31	<_joe_>	rebooting all kubernetes worker nodes in production	[production]
14:31	<moritzm>	rolling reboot of thumbor* for kernel update	[production]
14:29	<marostegui>	Stopping replication on db2055 to use it to clone another host - T146261	[production]
13:55	<bblack>	restart isc-dhcp-server on carbon	[production]
13:55	<moritzm>	rolling reboot of thumbor* for kernel update	[production]
13:40	<moritzm>	completed rolling reboot of restbase in codfw	[production]
13:14	<marostegui>	Deploying schema change S6 ruwiki for table ores_model - T147734	[production]
12:24	<moritzm>	rebooting ruthenium for kernel update	[production]
12:02	<moritzm>	rebooting bromine for kernel update	[production]
11:28	<gehel>	starting rolling restart of elasticsearch eqiad cluster	[production]
11:04	<moritzm>	rebooting hafnium for kernel update	[production]
10:49	<jynus@mira>	Synchronized wmf-config/db-eqiad.php: mariadb: pool db1053 as the new rc special slave after maintenance (duration: 01m 00s)	[production]
10:36	<marostegui>	Deploying schema change S2 several wikis for table ores_model - T147734	[production]
10:28	<bblack>	rebooting radon (ns0)	[production]
10:22	<moritzm>	rolling reboot of restbase in codfw for kernel update	[production]
10:09	<marostegui>	Deploying schema change S7 fawiki.ores_model - T147734	[production]
10:04	<moritzm>	rebooting seaborgium (labs LDAP server) for kernel update	[production]
09:51	<marostegui>	Deploying schema change S5 wikidatawiki.ores_model - T147734	[production]
09:48	<moritzm>	rebooting neon (icinga host) for kernel update	[production]
09:35	<marostegui>	Deploying schema change S1 enwiki.ores_model in eqiad - T147734	[production]
09:32	<elukey>	rebooting kafka100[12] for kernel upgrades (EventBus hosts)	[production]
09:26	<moritzm>	rebooting krypton for kernel update	[production]
09:18	<godog>	start rolling reboot of ms-be machines in eqiad for kernel update	[production]
09:15	<moritzm>	rebooting meitnerium (archiva.wikimedia.org) for kernel update	[production]
09:13	<jynus>	reviewing and applying new watchdog events to all core dbs T148790	[production]
09:06	<moritzm>	rebooting serpens (labs LDAP server) for kernel update	[production]
08:49	<moritzm>	rebooting ununpentium (RT) for kernel update	[production]
08:40	<marostegui>	Deploying schema change S1 enwiki.ores_model in codfw - T147734	[production]
08:38	<moritzm>	rebooting radium (tor relay) for kernel update	[production]
08:35	<moritzm>	rebooting aluminium (url_downloader for eqiad) for kernel update	[production]
08:25	<moritzm>	rebooting alsafi (url_downloader for codfw) for kernel update	[production]
08:23	<jynus>	applying events_coredb_slave.sql to db1070	[production]