2018-04-11
13:27 <marostegui> Drop prefstats table on s3 sanitarium master (db1072); this might cause lag on labs - T154490 [production]
13:26 <moritzm> installing java security updates on kafka/main cluster [production]
13:25 <marostegui@tin> Synchronized wmf-config/db-eqiad.php: Depool db1072 (duration: 01m 00s) [production]
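(The depool above, like the repools elsewhere in this log, is an edit to MediaWiki's database load lists in wmf-config/db-eqiad.php or db-codfw.php, synchronized to all app servers with scap. A minimal Python stand-in for the idea follows; the real file is a PHP array, and all hostnames besides db1072 and all weights here are illustrative, not from the log:)

```python
# Conceptual sketch only: the real config is a PHP array in
# wmf-config/db-eqiad.php, synchronized to app servers with scap.
# db1095 and the weights are illustrative.
section_loads = {
    "s3": {
        "db1072": 100,  # replica to depool before the prefstats drop
        "db1095": 200,  # illustrative second replica
    },
}

def depool(section: str, host: str) -> None:
    # Removing the entry (or zeroing its weight) takes the replica out of
    # the read rotation; replication continues, so schema maintenance can
    # run on it without it serving reads.
    section_loads[section].pop(host, None)

depool("s3", "db1072")
print(section_loads)  # {'s3': {'db1095': 200}}
```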
13:13 <marostegui> Drop prefstats table on s1 codfw master - db2048 (this might generate lag on codfw) - T154490 [production]
13:12 <elukey> restart kafka brokers on kafka1012->23 for openjdk-7 upgrades [production]
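(A rolling restart like the one above typically touches one broker at a time and waits for replication to recover before moving on, so the cluster never loses redundancy. A hedged sketch of that loop, assuming SSH access and a placeholder health check; neither is from the log:)

```python
import subprocess
import time

# kafka1012 through kafka1023, per the "kafka1012->23" shorthand above.
BROKERS = [f"kafka10{n}" for n in range(12, 24)]

def under_replicated_partitions() -> int:
    # Placeholder: a real check would read the cluster's
    # UnderReplicatedPartitions metric via JMX or Prometheus.
    return 0

for broker in BROKERS:
    # Restart one broker so it picks up the new openjdk-7 packages.
    subprocess.run(["ssh", broker, "sudo", "systemctl", "restart", "kafka"],
                   check=True)
    # Do not touch the next broker until all partitions are fully
    # replicated again.
    while under_replicated_partitions() > 0:
        time.sleep(30)
```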
13:09 <marostegui> Drop prefstats table on s3 codfw master - db2043 (this might generate lag on codfw) - T154490 [production]
13:01 <vgutierrez> Reimage lvs4007 as stretch [production]
13:00 <jynus@tin> Synchronized wmf-config/db-codfw.php: Repool es2012 (duration: 01m 00s) [production]
12:39 <mobrovac@tin> Synchronized wmf-config/InitialiseSettings.php: Switch a batch of low-traffic jobs to EventBus for testwikis, file 1/2 (retry #2) (duration: 01m 01s) [production]
12:32 <mobrovac@tin> Synchronized wmf-config/InitialiseSettings.php: Switch a batch of low-traffic jobs to EventBus for testwikis, file 1/2 (retry) - T190327 (duration: 01m 00s) [production]
12:21 <mobrovac@tin> Synchronized wmf-config/InitialiseSettings.php: Switch a batch of low-traffic jobs to EventBus for testwikis, file 1/2 - T190327 (duration: 01m 01s) [production]
12:21 <moritzm> enable production traffic for mw1265 (stretch app server) for a brief test period [production]
12:09 <jynus> start reimage of es2012 [production]
12:05 <jynus@tin> Synchronized wmf-config/db-codfw.php: Repool es2011, depool es2012 (duration: 01m 01s) [production]
11:47 <jynus> start reimage of es2011 [production]
11:09 <ema> start pybal on lvs5001, test completed on lvs5003 [production]
11:04 <marostegui> Drop table prefstats in s7 - T154490 [production]
10:59 <jynus@tin> Synchronized wmf-config/db-codfw.php: Repool es2015, depool es2011 (duration: 00m 59s) [production]
10:56 <ema> stop pybal on lvs5001 to test requests through lvs5003 (reimaged as stretch) - T191897 [production]
10:50 <moritzm> installing openssl updates [production]
10:43 <marostegui> Drop table prefstats in s2 - T154490 [production]
10:33 <marostegui> Drop table prefstats in s4 - T154490 [production]
10:31 <marostegui> Drop table prefstats in s6 - T154490 [production]
10:28 <marostegui> Drop table prefstats in s5 - T154490 [production]
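(Each of these prefstats drops for T154490 is a single DDL statement run on the section's master. A minimal sketch assuming pymysql, with illustrative host, database, and credentials; the log entries warn about lag because the statement replicates to every replica in the section:)

```python
import pymysql

def drop_prefstats(master_host: str, wiki_db: str) -> None:
    # Host, database name, and credentials here are illustrative, not from
    # the log; the real runs go through production maintenance tooling.
    conn = pymysql.connect(host=master_host, user="maint",
                           password="secret", database=wiki_db)
    try:
        with conn.cursor() as cur:
            # MySQL DDL is not transactional: the drop takes effect
            # immediately and flows down replication to every replica in
            # the section, which is what generates the lag noted above.
            cur.execute("DROP TABLE IF EXISTS prefstats")
    finally:
        conn.close()

drop_prefstats("s5-master.example.wmnet", "dewiki")
```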
10:04 <jynus> start reimage of es2015 [production]
10:00 <moritzm> installing java security updates on kafka/jumbo cluster [production]
09:57 <jynus@tin> Synchronized wmf-config/db-codfw.php: Repool es2014, depool es2015 (duration: 01m 02s) [production]
09:52 <moritzm> installing java security updates on kafka/analytics cluster [production]
09:29 <arturo> doing some testing on labtestvirt2001, mounting instance qcow2 files into /home/aborrero/mnt [production]
09:17 <jynus> start reimage of es2014 [production]
09:08 <jynus@tin> Synchronized wmf-config/db-codfw.php: Depool es2014 (duration: 01m 03s) [production]
09:03 <ema> restart pybal on lvs1003 for UDP monitoring config changes https://gerrit.wikimedia.org/r/#/c/425251/ [production]
08:59 <moritzm> reimaging mw1265 to stretch (T174431) [production]
08:18 <jynus> rerunning eqiad misc backups [production]
08:03 <marostegui@tin> Synchronized wmf-config/db-codfw.php: Repool db2069 as candidate master for x1 - T191275 (duration: 01m 03s) [production]
07:45 <ema> cp2022: restart varnish-be due to child process crash https://phabricator.wikimedia.org/P6979 T191229 [production]
07:27 <marostegui> Stop MySQL on db2033 to copy its data away before reimaging - T191275 [production]
07:08 <vgutierrez> Reimaging lvs5003.eqsin as stretch (2nd attempt) [production]
06:49 <elukey> restart Yarn Resource Manager daemons on analytics100[12] to pick up the new Prometheus configuration file [production]
06:20 <marostegui> Stop MySQL on db2033 to clone db2069 - T191275 [production]
06:17 <marostegui@tin> Synchronized wmf-config/db-eqiad.php: Add db2069 to the config as depooled x1 slave - T191275 (duration: 01m 03s) [production]
06:15 <marostegui@tin> Synchronized wmf-config/db-codfw.php: Add db2069 to the config as depooled x1 slave - T191275 (duration: 01m 01s) [production]
05:28 <Krinkle> manual coal back-fill still running with the normal coal disabled via systemd. Will restore normal coal when I wake up. [production]
05:22 <marostegui> Deploy schema change on codfw s8 master (db2045) with replication enabled (this will generate lag on codfw) - T187089 T185128 T153182 [production]
05:17 <marostegui> Reload haproxy on dbproxy1010 to repool labsdb1010 [production]
02:36 <l10nupdate@tin> scap sync-l10n completed (1.31.0-wmf.28) (duration: 05m 41s) [production]
00:12 <bstorm_> Updated views and indexes on labsdb1011 [production]
2018-04-10
23:32 <XioNoX> depooled eqsin due to router issue [production]
23:04 <Krinkle> Seemingly from 22:53 to 23:03 global traffic dropped by 30-60%, presumably due to issues in eqiad, where throughput fell from 10 Gbit/s to 3 Gbit/s more sharply than ever before. [production]
22:49 <joal@tin> Finished deploy [analytics/refinery@33448cd]: Deploying fixes after todays deploy errors (duration: 04m 46s) [production]