production SAL

2019-10-10
19:20	<dduvall@deploy1001>	rebuilt and synchronized wikiversions files: all wikis to 1.35.0-wmf.1	[production]
19:11	<dduvall@deploy1001>	rebuilt and synchronized wikiversions files: labswiki to 1.35.0-wmf.1	[production]
19:09	<dduvall@deploy1001>	Synchronized php-1.35.0-wmf.1/extensions/OpenStackManager: labswiki to 1.35.0-wmf.1 (duration: 01m 00s)	[production]
19:04	<marxarelli>	promoting labswiki to 1.35.0-wmf.1 cc: T233849	[production]
17:07	<jbond42>	puppetmaster1001 has been upgraded and is back serving requests	[production]
16:21	<urandom>	Upgrading sessionstore200[1-3].codfw.wmnet to Cassandra 3.11.4 -- T200803	[production]
16:18	<urandom>	Upgrading sessionstore1003.eqiad.wmnet to Cassandra 3.11.4 -- T200803	[production]
16:16	<urandom>	Upgrading sessionstore1002.eqiad.wmnet to Cassandra 3.11.4 -- T200803	[production]
16:11	<@>	helmfile [EQIAD] Ran 'apply' command on namespace 'termbox' for release 'production' .	[production]
16:07	<@>	helmfile [CODFW] Ran 'apply' command on namespace 'termbox' for release 'production' .	[production]
16:04	<thcipriani>	restarting gerrit due to T224448	[production]
16:04	<@>	helmfile [STAGING] Ran 'apply' command on namespace 'termbox' for release 'staging' .	[production]
16:01	<urandom>	Upgrading sessionstore1001.eqiad.wmnet to Cassandra 3.11.4 -- T200803	[production]
15:42	<@>	helmfile [STAGING] Ran 'apply' command on namespace 'termbox' for release 'test' .	[production]
15:23	<mholloway-shell@deploy1001>	Finished deploy [mobileapps/deploy@1adf74e]: Update mobileapps to c89aa55 (duration: 05m 39s)	[production]
15:18	<mholloway-shell@deploy1001>	Started deploy [mobileapps/deploy@1adf74e]: Update mobileapps to c89aa55	[production]
14:57	<marostegui@cumin1001>	dbctl commit (dc=all): 'Fully repool db1074 after getting its BBU replaced T231638', diff saved to https://phabricator.wikimedia.org/P9306 and previous config saved to /var/cache/conftool/dbconfig/20191010-145737-marostegui.json	[production]
14:54	<moritzm>	ran systemctl reset-failed on puppetmaster1001 (puppet-master.service after reimage)	[production]
14:42	<marostegui@cumin1001>	dbctl commit (dc=all): 'Slowly repool db1074 after BBU replacement T231638', diff saved to https://phabricator.wikimedia.org/P9305 and previous config saved to /var/cache/conftool/dbconfig/20191010-144201-marostegui.json	[production]
14:39	<marostegui@cumin1001>	dbctl commit (dc=all): 'Fully repool db1112 into recentchanges and remove db1078 from it after PDU maintenance', diff saved to https://phabricator.wikimedia.org/P9304 and previous config saved to /var/cache/conftool/dbconfig/20191010-143924-marostegui.json	[production]
14:36	<marostegui@cumin1001>	dbctl commit (dc=all): 'Fully repool to db1084 db1083 db1076 db1112 db1118 after PDU maintenance', diff saved to https://phabricator.wikimedia.org/P9303 and previous config saved to /var/cache/conftool/dbconfig/20191010-143633-marostegui.json	[production]
14:23	<marostegui@cumin1001>	dbctl commit (dc=all): 'More traffic to db1084 db1083 db1076 db1112 db1118 after PDU maintenance', diff saved to https://phabricator.wikimedia.org/P9302 and previous config saved to /var/cache/conftool/dbconfig/20191010-142323-marostegui.json	[production]
14:13	<marostegui@cumin1001>	dbctl commit (dc=all): 'Slowly repool db1084 db1083 db1076 db1112 db1118 after PDU maintenance', diff saved to https://phabricator.wikimedia.org/P9301 and previous config saved to /var/cache/conftool/dbconfig/20191010-141303-marostegui.json	[production]
14:04	<marostegui@deploy1001>	Synchronized wmf-config/db-eqiad.php: Fully repool es1013, es1014 after PDU maintenance (duration: 00m 59s)	[production]
14:03	<jbond42>	re-enable puppet now ca has been correctly moved	[production]
13:58	<marostegui@cumin1001>	dbctl commit (dc=all): 'Slowly repool db1112 after PDU maintenance', diff saved to https://phabricator.wikimedia.org/P9300 and previous config saved to /var/cache/conftool/dbconfig/20191010-135806-marostegui.json	[production]
13:57	<marostegui@cumin1001>	dbctl commit (dc=all): 'Slowly repool db1084 db1083 db1076 db1118 after PDU maintenance', diff saved to https://phabricator.wikimedia.org/P9299 and previous config saved to /var/cache/conftool/dbconfig/20191010-135659-marostegui.json	[production]
13:50	<jbond42>	disable puppet fleet wide as puppetmaster2002 is struggling	[production]
13:32	<jbond42>	reimage puppetmaster1001	[production]
13:27	<marostegui>	Repool labsdb1011 after reclone - T235016	[production]
13:16	<arturo>	added flannel 0.5.5-4 to buster-wikimedia (T235059)	[production]
13:05	<marostegui@deploy1001>	Synchronized wmf-config/db-eqiad.php: More traffic to es1013, es1014 after PDU maintenance (duration: 00m 58s)	[production]
13:00	<jbond@cumin2001>	END (PASS) - Cookbook sre.hosts.ipmi-password-reset (exit_code=0)	[production]
12:41	<marostegui@deploy1001>	Synchronized wmf-config/db-eqiad.php: Slowly repool es1013, es1014 after PDU maintenance (duration: 00m 59s)	[production]
11:57	<jbond@cumin2001>	Updating IPMI password on 1253 hosts - jbond@cumin2001	[production]
11:57	<jbond@cumin2001>	START - Cookbook sre.hosts.ipmi-password-reset	[production]
11:48	<jbond@cumin2001>	END (PASS) - Cookbook sre.hosts.ipmi-password-reset (exit_code=0)	[production]
11:46	<jbond@cumin2001>	Updating IPMI password on 35 hosts - jbond@cumin2001	[production]
11:46	<jbond@cumin2001>	START - Cookbook sre.hosts.ipmi-password-reset	[production]
11:41	<lucaswerkmeister-wmde@deploy1001>	Synchronized wmf-config/Wikibase.php: [[gerrit:542087|Fix typo in beta repo data bridge config (T235033)]] (duration: 00m 59s)	[production]
11:40	<marostegui>	Deploy schema change on s7 codfw master (db2118), this will generate lag on s7 codfw - T234066 T233135	[production]
11:38	<jbond@cumin2001>	END (FAIL) - Cookbook sre.hosts.ipmi-password-reset (exit_code=99)	[production]
11:38	<jbond@cumin2001>	Updating IPMI password on 1253 hosts - jbond@cumin2001	[production]
11:38	<jbond@cumin2001>	START - Cookbook sre.hosts.ipmi-password-reset	[production]
11:37	<arturo>	icinga downtime cloudvirt1023 for 2h (T227536)	[production]
11:36	<arturo>	icinga downtime cloudvirt1025 for 2h (T227536)	[production]
11:36	<jbond@cumin2001>	END (FAIL) - Cookbook sre.hosts.ipmi-password-reset (exit_code=99)	[production]
11:36	<jbond@cumin2001>	Updating IPMI password on 1253 hosts - jbond@cumin2001	[production]
11:36	<jbond@cumin2001>	START - Cookbook sre.hosts.ipmi-password-reset	[production]
11:35	<arturo>	icinga downtime cloudvirt1026 for 2h (T227536)	[production]