production SAL

2019-10-22
17:57	<sbassett>	Deployed security fix for T234450 to wmf.2	[production]
17:57	<mholloway-shell@deploy1001>	Finished deploy [mobileapps/deploy@b4c484a]: Build structured talk pages by walking the DOM (T235213) (duration: 05m 14s)	[production]
17:54	<mutante>	restarting gerrit to disable jgit gc (T236114)	[production]
17:51	<mholloway-shell@deploy1001>	Started deploy [mobileapps/deploy@b4c484a]: Build structured talk pages by walking the DOM (T235213)	[production]
17:37	<arlolra>	Updated Parsoid to cf01d91 (T234057, T234768, T235296, T235684, T235563)	[production]
17:26	<arlolra@deploy1001>	Finished deploy [parsoid/deploy@4c64c9c]: Updating Parsoid to cf01d91 (duration: 07m 37s)	[production]
17:20	<bblack>	geodns: re-pooling esams (at this point, we're entirely back in our "normal" state of affairs)	[production]
17:19	<arlolra@deploy1001>	Started deploy [parsoid/deploy@4c64c9c]: Updating Parsoid to cf01d91	[production]
16:51	<bblack>	geodns: moving all "normal" eqiad traffic back to eqiad (in addition to the esams-diverted traffic which is still pointed mostly at eqiad right now)	[production]
16:21	<mutante>	running puppet on deployment servers	[production]
16:20	<thcipriani>	restarting gerrit	[production]
16:14	<thcipriani>	stopping gerrit to run a fix for T222391	[production]
15:58	<bblack>	depooling esams temporarily to test traffic scenario on lvs1014	[production]
15:47	<bblack>	enable pybal+puppet on rebooted lvs1014	[production]
15:40	<bblack>	rebooting lvs1014	[production]
15:28	<liw@deploy1001>	Finished scap: testwiki to php-1.35.0-wmf.3 and rebuild l10n cache (duration: 37m 39s)	[production]
15:26	<XioNoX>	repool esams	[production]
15:20	<XioNoX>	rollback ns2 redirect	[production]
15:13	<bblack>	re-disabling lvs1014 ...	[production]
15:10	<bblack>	re-enabling lvs1014 pybal/puppet	[production]
15:03	<moritzm>	rebooting kafka-main1005 for microcode debugging	[production]
15:01	<jmm@cumin2001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)	[production]
15:01	<jmm@cumin2001>	START - Cookbook sre.hosts.downtime	[production]
14:52	<bblack>	stopping puppet and pybal on lvs1014 (upload+maps traffic to 1016)	[production]
14:50	<liw@deploy1001>	Started scap: testwiki to php-1.35.0-wmf.3 and rebuild l10n cache	[production]
14:45	<mbsantos@deploy1001>	Finished deploy [kartotherian/deploy@85ea6e1]: Deploy kartotherian 1.1.5-wmf.0 (duration: 02m 44s)	[production]
14:42	<mbsantos@deploy1001>	Started deploy [kartotherian/deploy@85ea6e1]: Deploy kartotherian 1.1.5-wmf.0	[production]
14:13	<XioNoX>	restart asw-esams for onsite work	[production]
13:52	<andrewbogott>	restarted slapd on ldap-eqiad-replica01	[production]
13:38	<gehel>	silencing LVS check for kartotherian (we know there is an issue) - T236163	[production]
13:35	<liw@deploy1001>	scap failed: CalledProcessError Command '/usr/local/bin/mwscript rebuildLocalisationCache.php --wiki="labtestwiki" --outdir="/tmp/scap_l10n_2419219323" --threads=30 --lang en --quiet' returned non-zero exit status 1 (duration: 06m 40s)	[production]
13:28	<liw@deploy1001>	Started scap: testwiki to php-1.34.0-wmf.3 and rebuild l10n cache	[production]
13:13	<ayounsi@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)	[production]
13:13	<ayounsi@cumin1001>	START - Cookbook sre.hosts.downtime	[production]
13:06	<XioNoX>	depool esams for onsite work - T235805	[production]
13:05	<marostegui@cumin1001>	dbctl commit (dc=all): 'Fully repool db1096:3316 db1105:3311 db1105:3312 after PDU and on-site maintenance', diff saved to https://phabricator.wikimedia.org/P9434 and previous config saved to /var/cache/conftool/dbconfig/20191022-130556-marostegui.json	[production]
12:54	<marostegui@cumin1001>	dbctl commit (dc=all): 'More traffic to db1096:3316 db1105:3311 instance db1105:3312 after PDU and on-site maintenance', diff saved to https://phabricator.wikimedia.org/P9433 and previous config saved to /var/cache/conftool/dbconfig/20191022-125435-marostegui.json	[production]
12:46	<marostegui@cumin1001>	dbctl commit (dc=all): 'More traffic to db1096:3316 db1105:3311 instance db1105:3312 after PDU and on-site maintenance', diff saved to https://phabricator.wikimedia.org/P9432 and previous config saved to /var/cache/conftool/dbconfig/20191022-124607-marostegui.json	[production]
12:37	<marostegui@cumin1001>	dbctl commit (dc=all): 'Slowly repool db1096:3316 after PDU maintenance', diff saved to https://phabricator.wikimedia.org/P9431 and previous config saved to /var/cache/conftool/dbconfig/20191022-123757-marostegui.json	[production]
12:32	<marostegui@cumin1001>	dbctl commit (dc=all): 'Slowly repool db1105:3312 and db1105:3311 after on-site maintenance T235877', diff saved to https://phabricator.wikimedia.org/P9430 and previous config saved to /var/cache/conftool/dbconfig/20191022-123257-marostegui.json	[production]
12:30	<marostegui@cumin1001>	dbctl commit (dc=all): 'Repool db2089:3315', diff saved to https://phabricator.wikimedia.org/P9429 and previous config saved to /var/cache/conftool/dbconfig/20191022-123032-marostegui.json	[production]
12:29	<moritzm>	rebooting miscweb2001 for some microcode tests	[production]
12:28	<marostegui>	Compress db1096:3315	[production]
12:27	<jmm@cumin2001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)	[production]
12:27	<jmm@cumin2001>	START - Cookbook sre.hosts.downtime	[production]
12:25	<marostegui@deploy1001>	Synchronized wmf-config/db-eqiad.php: Repool pc1007 after PDU maintenance T227142 (duration: 00m 50s)	[production]
12:14	<jynus>	reimage to buster dbmonitor2001.wikimedia.org T224589	[production]
11:57	<liw>	starting to cut branch for train 1.35-wmf.3	[production]
11:51	<hashar>	Restarted CI Jenkins on contint1001	[production]
11:35	<marostegui>	Stop MySQL on db1105:3311, db1105:3312 for firmware upgrade - T235877	[production]