production SAL

2021-07-13
13:30	<jmm@cumin2002>	END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Muehlenhoff out of all services on: 2 hosts	[production]
13:29	<jmm@cumin2002>	START - Cookbook sre.idm.logout Logging Muehlenhoff out of all services on: 2 hosts	[production]
13:14	<kormat>	restarted replication on db1117:3325 T284622	[production]
13:11	<jmm@cumin2002>	END (FAIL) - Cookbook sre.idm.logout (exit_code=99) Logging Muehlenhoff out of all services on: 1732 hosts	[production]
13:10	<jmm@cumin2002>	START - Cookbook sre.idm.logout Logging Muehlenhoff out of all services on: 1732 hosts	[production]
13:10	<hashar>	Upgraded Apache on gerrit1001 and gerrit2001	[production]
13:09	<jmm@cumin2002>	END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Muehlenhoff out of all services on: 1732 hosts	[production]
13:08	<jmm@cumin2002>	START - Cookbook sre.idm.logout Logging Muehlenhoff out of all services on: 1732 hosts	[production]
12:55	<jmm@cumin2002>	END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Muehlenhoff out of all services on: 1732 hosts	[production]
12:53	<kormat>	stopping replication on db1117:3325 T284622	[production]
12:53	<kormat@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1117.eqiad.wmnet with reason: Copy m5 from db1117 to db1183 T284622	[production]
12:53	<kormat@cumin1001>	START - Cookbook sre.hosts.downtime for 4:00:00 on db1117.eqiad.wmnet with reason: Copy m5 from db1117 to db1183 T284622	[production]
12:43	<jmm@cumin2002>	START - Cookbook sre.idm.logout Logging Muehlenhoff out of all services on: 1732 hosts	[production]
12:41	<mutante>	depooling and decom'ing eqiad API servers mw1281, mw1282, mw1283 - T280203	[production]
12:40	<dzahn@cumin1001>	conftool action : set/pooled=no; selector: name=mw128[1-3].eqiad.wmnet	[production]
12:20	<mutante>	mwmaint1002 - scap pull after reimaging	[production]
11:33	<dzahn@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mwmaint1002.eqiad.wmnet with reason: REIMAGE	[production]
11:31	<dzahn@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on mwmaint1002.eqiad.wmnet with reason: REIMAGE	[production]
11:28	<Lucas_WMDE>	EU backport+config window done	[production]
11:25	<lucaswerkmeister-wmde@deploy1002>	Synchronized wmf-config/CommonSettings.php: Config: [[gerrit:704304|Remove obsolete $wgShowDBErrorBacktrace config]] (duration: 01m 25s)	[production]
11:13	<mutante>	mwmaint1002 - reimaging with buster (T267607)	[production]
10:54	<mutante>	switching https://noc.wikimedia.org backend from eqiad to codfw for mwmaint1002 OS upgrade, not affecting config-master/pybal, tests passed (T267607)	[production]
10:44	<moritzm>	upgrading apache on phab1001 (phabricator.wikimedia.org)	[production]
10:39	<hnowlan@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on maps2008.codfw.wmnet with reason: reimaging as buster replica	[production]
10:39	<hnowlan@cumin1001>	START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on maps2008.codfw.wmnet with reason: reimaging as buster replica	[production]
10:39	<hnowlan>	running `nodetool decommission` on maps2008	[production]
10:27	<moritzm>	installing apache security updates on alert1001 (icinga.wikimedia.org)	[production]
10:21	<kormat@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on 18 hosts with reason: Deploying schema change to s1 T277116	[production]
10:21	<kormat@cumin1001>	START - Cookbook sre.hosts.downtime for 4:00:00 on 18 hosts with reason: Deploying schema change to s1 T277116	[production]
10:18	<moritzm>	installing apache security updates on Logstash hosts	[production]
09:58	<moritzm>	upgrading PHP/Apache on matomo1002 (piwik.wikimedia.org)	[production]
09:40	<moritzm>	installing apache security updates on thanos-fe hosts	[production]
09:38	<moritzm>	installing apache security updates on parsoid hosts	[production]
09:31	<effie>	depool mw2383 T286463	[production]
09:18	<volans@cumin2002>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sretest1001.eqiad.wmnet	[production]
09:15	<volans@cumin2002>	START - Cookbook sre.hosts.reboot-single for host sretest1001.eqiad.wmnet	[production]
09:00	<kormat@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on 12 hosts with reason: Deploying schema change to s3 T277116	[production]
09:00	<kormat@cumin1001>	START - Cookbook sre.hosts.downtime for 4:00:00 on 12 hosts with reason: Deploying schema change to s3 T277116	[production]
08:59	<volans@cumin2002>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on sretest1001.eqiad.wmnet with reason: testing the cookbook	[production]
08:59	<volans@cumin2002>	START - Cookbook sre.hosts.downtime for 0:10:00 on sretest1001.eqiad.wmnet with reason: testing the cookbook	[production]
08:45	<effie>	depool mw2383 - T286463	[production]
08:02	<moritzm>	upgrade bullseye pilot installs to latest state of bullseye	[production]
07:06	<moritzm>	installing apache security updates on codfw mw* hosts	[production]
06:53	<elukey>	systemctl reset-failed ifup@ens5 on gitlab2001 - T273026	[production]
06:06	<effie>	pool mw2383 - T286463	[production]
04:09	<ryankemper>	[WDQS Deploy] Restarting `wdqs-categories` across lvs-managed hosts, one node at a time: `sudo -E cumin -b 1 'A:wdqs-all and not A:wdqs-test' 'depool && sleep 45 && systemctl restart wdqs-categories && sleep 45 && pool'`	[production]
04:09	<ryankemper>	[WDQS Deploy] Restarted `wdqs-categories` across all test hosts simultaneously: `sudo -E cumin 'A:wdqs-test' 'systemctl restart wdqs-categories'`	[production]
04:09	<ryankemper>	[WDQS Deploy] Restarted `wdqs-updater` across all hosts, 4 hosts at a time: `sudo -E cumin -b 4 'A:wdqs-all' 'systemctl restart wdqs-updater'`	[production]
04:05	<ryankemper@deploy1002>	Finished deploy [wdqs/wdqs@36f74b3]: 0.3.76 (duration: 08m 28s)	[production]
03:56	<ryankemper@deploy1002>	Started deploy [wdqs/wdqs@36f74b3]: 0.3.76	[production]