production SAL

2019-06-24
14:43	<jbond@cumin1001>	START - Cookbook sre.hosts.downtime	[production]
14:37	<jbond@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)	[production]
14:37	<jbond@cumin1001>	START - Cookbook sre.hosts.downtime	[production]
14:15	<jbond@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)	[production]
14:15	<jbond@cumin1001>	START - Cookbook sre.hosts.downtime	[production]
14:01	<ema>	cp3032: upgrade varnish to 5.1.3-1wm11 T226375	[production]
13:51	<jbond42>	rolling restart of the conf servers starting in 10 minutes; please let me know if you foresee any issue	[production]
13:50	<reedy@deploy1001>	Synchronized wmf-config/flaggedrevs.php: T225144 T225276 T225414 T225776 T225797 T226054 (duration: 00m 56s)	[production]
13:26	<moritzm>	re-enabling TCP SACKs on cp4024-4029 (half of Varnish/text and Varnish/upload in ulsfo) T225998	[production]
13:25	<jbond42>	update libvirt on cloudvirt* stretch servers	[production]
13:19	<moritzm>	re-enabling TCP SACKs on cp3040-cp3047, cp3049 (half of Varnish/text and Varnish/upload in esams) T225998	[production]
13:10	<moritzm>	re-enabling TCP SACKs on cp2001,2002,2004-2008,2010,2011, 2014, 2017 (half of Varnish/text and Varnish/upload in codfw) T225998	[production]
13:04	<moritzm>	re-enabling TCP SACKs on cp1075-1082 (half of Varnish/text and Varnish/upload in eqiad) T225998	[production]
13:00	<gehel>	shutdown wdqs updater on wdqs/public/eqiad	[production]
12:49	<gehel>	restarting blazegraph on wdqs1004 (JVM thread out of control)	[production]
11:31	<Lucas_WMDE>	EU SWAT done	[production]
11:30	<lucaswerkmeister-wmde@deploy1001>	Synchronized wmf-config/InitialiseSettings-labs.php: SWAT: [[gerrit:518186|Labs: enable QuickSurveys on hewiki (T225819)]] (duration: 00m 57s)	[production]
10:36	<moritzm>	re-enabling TCP SACKs on cp5007-cp5009 (half of Varnish/text in eqsin) T225998	[production]
10:28	<moritzm>	re-enabling TCP SACKs on cp5001-cp5003 (half of Varnish/upload in eqsin) T225998	[production]
09:23	<elukey>	reboot of kafka-jumbo100[1-6] for kernel + openjdk upgrades	[production]
08:56	<elukey>	re-enable eventlogging mysql consumers after maintenance on eventlog1002	[production]
08:52	<marostegui>	Upgrade Mysql on db1140 (checked that all snapshots backups are done) - T226358	[production]
08:42	<elukey>	reboot an-master100[1,2] for kernel + openjdk upgrades	[production]
08:38	<jynus>	upgrade, stop and restart db1108	[production]
08:34	<jynus>	reloading haproxy on dbproxy1004/9	[production]
08:24	<marostegui@deploy1001>	Synchronized wmf-config/db-eqiad.php: Repool db1120 after upgrade T226358 (duration: 00m 56s)	[production]
08:14	<jynus>	upgrade, stop and restart db1107	[production]
08:09	<marostegui>	Stop MySQL on db1120 for upgrade - T226358	[production]
08:08	<marostegui@deploy1001>	Synchronized wmf-config/db-eqiad.php: Depool db1120 for upgrade T226358 (duration: 00m 56s)	[production]
07:51	<elukey>	stop mysql consumer on eventlog1002 (so traffic to db1107 will be stopped, to allow maintenance to happen)	[production]
07:06	<moritzm>	installing vim update for stretch	[production]
06:31	<_joe_>	publishing docker-registry.wikimedia.org/nodejs10-slim:0.0.2, T226346	[production]
06:16	<elukey>	powercycle analytics1060 (stuck, no ssh, no console com2 available)	[production]
06:01	<marostegui>	Stop MySQL on db1117:3321 to clone db1135 (haproxy alert will be triggered) - T222682	[production]
05:57	<_joe_>	rebuilding base debian/alpine images to pick up security updates	[production]
05:07	<marostegui@deploy1001>	Synchronized wmf-config/db-codfw.php: Remove db1135 from config T222682 (duration: 00m 55s)	[production]
05:06	<marostegui@deploy1001>	Synchronized wmf-config/db-eqiad.php: Remove db1135 from config T222682 (duration: 01m 07s)	[production]
04:59	<marostegui>	Rename table wikimedia_editor_tasks_entity_description_exists in db1123 (testwikidatawiki) T226326	[production]
04:54	<marostegui>	Rename table wikimedia_editor_tasks_entity_description_exists in db1092 T226326	[production]
2019-06-21
14:54	<jmm@cumin2001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)	[production]
14:53	<jmm@cumin2001>	START - Cookbook sre.hosts.downtime	[production]
14:51	<moritzm>	rebooting planet1001 to pick up MDS mitigations/new kernel	[production]
14:50	<jmm@cumin2001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)	[production]
14:50	<jmm@cumin2001>	START - Cookbook sre.hosts.downtime	[production]
14:50	<jmm@cumin2001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)	[production]
14:49	<jmm@cumin2001>	START - Cookbook sre.hosts.downtime	[production]
14:37	<moritzm>	rebooting kerberos1001 to pick up MDS mitigations/new kernel	[production]
14:26	<jmm@cumin2001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)	[production]
14:26	<jmm@cumin2001>	START - Cookbook sre.hosts.downtime	[production]
14:23	<ema@cumin1001>	END (PASS) - Cookbook sre.hosts.upgrade-and-reboot (exit_code=0)	[production]