production SAL

2016-10-24
14:57	<paravoid>	restarting ferm on es2015	[production]
14:54	<bblack>	starting ferm server on eeden, radon	[production]
14:41	<gehel@puppetmaster1001>	conftool action : set/pooled=yes; selector: dc=eqiad,cluster=maps,service=kartotherian,name=maps1002.eqiad.wmnet	[production]
14:38	<dereckson@mira>	Synchronized wmf-config/CommonSettings.php: Toggle wgDefaultUserOptions['watchdefault'] on for cs.wikipedia, off elsewhere (T148328, 2/2) (duration: 00m 50s)	[production]
14:36	<dereckson@mira>	Synchronized wmf-config/InitialiseSettings.php: Toggle wgDefaultUserOptions['watchdefault'] on for cs.wikipedia, off elsewhere (T148328, 1/2) (duration: 00m 54s)	[production]
14:36	<bblack>	disabling puppet on all caches ahead of port# work, to test - T107749 / https://gerrit.wikimedia.org/r/#/c/317405	[production]
14:29	<yurik>	re-deployed current kartotherian to all servers (maps1002 & maps-test* were stale)	[production]
14:11	<marostegui>	Deploy schema change s5 dewiki.revision - only codfw T148967	[production]
14:03	<l10nupdate@mira>	ResourceLoader cache refresh completed at Mon Oct 24 14:03:07 UTC 2016 (duration 6m 17s)	[production]
13:56	<dereckson@mira>	scap sync-l10n completed (1.28.0-wmf.22) (duration: 10m 46s)	[production]
13:42	<bblack>	restarting all varnish frontends (serially per-cluster with proper depooling, etc)	[production]
13:20	<elukey>	reimaging mc120[89] and mc1030	[production]
13:18	<Dereckson>	Started manually l10nupdate, as it didn't run for 6 days, and more especially to fix T148921 user-facing issue.	[production]
13:13	<dereckson@mira>	Synchronized wmf-config/throttle.php: Edit-a-thon BDA (Poitiers) throttle rule (T148852) (duration: 01m 13s)	[production]
10:47	<elukey>	reimaged mc102[56], currently doing mc1027	[production]
10:21	<_joe_>	rebooting kubernetes1002	[production]
09:20	<mobrovac>	change-prop deploying c7feda2	[production]
09:09	<mobrovac>	restbase deploy end of f9017ad	[production]
08:55	<akosiaris>	rebooting cobalt (gerrit) for kernel upgrades	[production]
08:53	<elukey>	reimaging mc1024	[production]
08:46	<mobrovac>	restbase deploy start of f9017ad	[production]
08:38	<gehel>	continue rolling restart of elasticsearch eqiad cluster	[production]
08:38	<hashar>	Restarting gallium (Jenkins/Zuul) for kernel upgrades	[production]
08:36	<akosiaris>	rebooting labnodepool1001 for kernel upgrades	[production]
08:36	<akosiaris>	rebooting scandium for kernel upgrades	[production]
08:33	<hashar>	rebooting contint1001	[production]
08:20	<elukey>	reimaging mc1023.eqiad.wmnet	[production]
07:46	<elukey>	reimaging mc1022.eqiad.wmnet (T137345)	[production]
07:09	<marosteg1i>	Deploying alter table s1.enwiki on codfw - T147166	[production]
2016-10-22
15:37	<oblivian@puppetmaster1001>	conftool action : set/pooled=no; selector: name=cp1052.eqiad.wmnet	[production]
15:02	<bblack@puppetmaster1001>	conftool action : set/pooled=yes; selector: name=cp1052.eqiad.wmnet	[production]
15:02	<bblack>	repool cp1052 - T148891	[production]
14:52	<bblack>	rebooted cp1052 - T148891	[production]
14:26	<bblack>	depooled cp1052 (cache_text@eqiad, ethernet linkdown for unknown reasons)	[production]
12:34	<marostegui>	Stopping replication in db2055 to use it to clone another host - T146261	[production]
2016-10-21
23:45	<mutante>	depooling maps1002 (by running "depool" on the server itself)	[production]
23:35	<yurik>	maps1002.eqiad is running older/incorrect/misbehaving software for some reason, restart didn't help. Need to depool	[production]
22:17	<mutante>	cp4006,cp4014 gzipped some logs in home for disk space	[production]
22:08	<mutante>	cp4006, cp4014 were running out of disk, apt-get clean	[production]
21:40	<mutante>	phab2001 that IP was also on iridium/phab1001, it should not be hardcoded in puppet, causing issues in T143363	[production]
21:37	<mutante>	phab2001 - ip addr del 10.64.32.186/21 dev eth0	[production]
21:06	<bblack>	restarting varnish backends (depooled, etc) for eqiad cache_upload: cp1049, cp1072, cp1074	[production]
19:50	<cmjohnson1>	dataset1001 array 1 swap failed disk slot 4	[production]
19:40	<cmjohnson1>	labvirt1005 swapping disk 0	[production]
19:40	<gehel>	routing traffic for cache-maps in codfw -> eqiad	[production]
19:29	<gehel>	running puppet on eqiad cache nodes to activate maps traffic redirection	[production]
19:06	<gehel>	shutting down cassandra on maps2004, seems to have lost data	[production]
18:22	<ejegg>	updated SmashPig from d1ca0632d00dfb608f70ca4b70251a5ba49f4411 to e28b2cd9f0c1429acdd2a08c68f95884dbffb594	[production]
16:45	<ejegg>	updated fundraising tools from 09ae6e24d8ca8350dc099d63a6ca0d9ec9fdef2b to f83e39291adc55677fc4b49307dc4807eba18019	[production]
16:33	<mutante>	rebooting planet1001 - *.planet.wm.org will be right back	[production]