2019-10-10
16:21 <urandom> Upgrading sessionstore200[1-3].codfw.wmnet to Cassandra 3.11.4 -- T200803 [production]
16:18 <urandom> Upgrading sessionstore1003.eqiad.wmnet to Cassandra 3.11.4 -- T200803 [production]
16:16 <urandom> Upgrading sessionstore1002.eqiad.wmnet to Cassandra 3.11.4 -- T200803 [production]
16:11 <@> helmfile [EQIAD] Ran 'apply' command on namespace 'termbox' for release 'production' . [production]
16:07 <@> helmfile [CODFW] Ran 'apply' command on namespace 'termbox' for release 'production' . [production]
16:04 <thcipriani> restarting gerrit due to T224448 [production]
16:04 <@> helmfile [STAGING] Ran 'apply' command on namespace 'termbox' for release 'staging' . [production]
16:01 <urandom> Upgrading sessionstore1001.eqiad.wmnet to Cassandra 3.11.4 -- T200803 [production]
15:42 <@> helmfile [STAGING] Ran 'apply' command on namespace 'termbox' for release 'test' . [production]
15:23 <mholloway-shell@deploy1001> Finished deploy [mobileapps/deploy@1adf74e]: Update mobileapps to c89aa55 (duration: 05m 39s) [production]
15:18 <mholloway-shell@deploy1001> Started deploy [mobileapps/deploy@1adf74e]: Update mobileapps to c89aa55 [production]
14:57 <marostegui@cumin1001> dbctl commit (dc=all): 'Fully repool db1074 after getting its BBU replaced T231638', diff saved to https://phabricator.wikimedia.org/P9306 and previous config saved to /var/cache/conftool/dbconfig/20191010-145737-marostegui.json [production]
14:54 <moritzm> ran systemctl reset-failed on puppetmaster1001 (puppet-master.service after reimage) [production]
14:42 <marostegui@cumin1001> dbctl commit (dc=all): 'Slowly repool db1074 after BBU replacement T231638', diff saved to https://phabricator.wikimedia.org/P9305 and previous config saved to /var/cache/conftool/dbconfig/20191010-144201-marostegui.json [production]
14:39 <marostegui@cumin1001> dbctl commit (dc=all): 'Fully repool db1112 into recentchanges and remove db1078 from it after PDU maintenance', diff saved to https://phabricator.wikimedia.org/P9304 and previous config saved to /var/cache/conftool/dbconfig/20191010-143924-marostegui.json [production]
14:36 <marostegui@cumin1001> dbctl commit (dc=all): 'Fully repool to db1084 db1083 db1076 db1112 db1118 after PDU maintenance', diff saved to https://phabricator.wikimedia.org/P9303 and previous config saved to /var/cache/conftool/dbconfig/20191010-143633-marostegui.json [production]
14:23 <marostegui@cumin1001> dbctl commit (dc=all): 'More traffic to db1084 db1083 db1076 db1112 db1118 after PDU maintenance', diff saved to https://phabricator.wikimedia.org/P9302 and previous config saved to /var/cache/conftool/dbconfig/20191010-142323-marostegui.json [production]
14:13 <marostegui@cumin1001> dbctl commit (dc=all): 'Slowly repool db1084 db1083 db1076 db1112 db1118 after PDU maintenance', diff saved to https://phabricator.wikimedia.org/P9301 and previous config saved to /var/cache/conftool/dbconfig/20191010-141303-marostegui.json [production]
14:04 <marostegui@deploy1001> Synchronized wmf-config/db-eqiad.php: Fully repool es1013, es1014 after PDU maintenance (duration: 00m 59s) [production]
14:03 <jbond42> re-enable puppet now ca has been correctly moved [production]
13:58 <marostegui@cumin1001> dbctl commit (dc=all): 'Slowly repool db1112 after PDU maintenance', diff saved to https://phabricator.wikimedia.org/P9300 and previous config saved to /var/cache/conftool/dbconfig/20191010-135806-marostegui.json [production]
13:57 <marostegui@cumin1001> dbctl commit (dc=all): 'Slowly repool db1084 db1083 db1076 db1118 after PDU maintenance', diff saved to https://phabricator.wikimedia.org/P9299 and previous config saved to /var/cache/conftool/dbconfig/20191010-135659-marostegui.json [production]
13:50 <jbond42> disable puppet fleet wide as puppetmaster2002 is struggling [production]
13:32 <jbond42> reimage puppetmaster1001 [production]
13:27 <marostegui> Repool labsdb1011 after reclone - T235016 [production]
13:16 <arturo> added flannel 0.5.5-4 to buster-wikimedia (T235059) [production]
13:05 <marostegui@deploy1001> Synchronized wmf-config/db-eqiad.php: More traffic to es1013, es1014 after PDU maintenance (duration: 00m 58s) [production]
13:00 <jbond@cumin2001> END (PASS) - Cookbook sre.hosts.ipmi-password-reset (exit_code=0) [production]
12:41 <marostegui@deploy1001> Synchronized wmf-config/db-eqiad.php: Slowly repool es1013, es1014 after PDU maintenance (duration: 00m 59s) [production]
11:57 <jbond@cumin2001> Updating IPMI password on 1253 hosts - jbond@cumin2001 [production]
11:57 <jbond@cumin2001> START - Cookbook sre.hosts.ipmi-password-reset [production]
11:48 <jbond@cumin2001> END (PASS) - Cookbook sre.hosts.ipmi-password-reset (exit_code=0) [production]
11:46 <jbond@cumin2001> Updating IPMI password on 35 hosts - jbond@cumin2001 [production]
11:46 <jbond@cumin2001> START - Cookbook sre.hosts.ipmi-password-reset [production]
11:41 <lucaswerkmeister-wmde@deploy1001> Synchronized wmf-config/Wikibase.php: [[gerrit:542087|Fix typo in beta repo data bridge config (T235033)]] (duration: 00m 59s) [production]
11:40 <marostegui> Deploy schema change on s7 codfw master (db2118), this will generate lag on s7 codfw - T234066 T233135 [production]
11:38 <jbond@cumin2001> END (FAIL) - Cookbook sre.hosts.ipmi-password-reset (exit_code=99) [production]
11:38 <jbond@cumin2001> Updating IPMI password on 1253 hosts - jbond@cumin2001 [production]
11:38 <jbond@cumin2001> START - Cookbook sre.hosts.ipmi-password-reset [production]
11:37 <arturo> icinga downtime cloudvirt1023 for 2h (T227536) [production]
11:36 <arturo> icinga downtime cloudvirt1025 for 2h (T227536) [production]
11:36 <jbond@cumin2001> END (FAIL) - Cookbook sre.hosts.ipmi-password-reset (exit_code=99) [production]
11:36 <jbond@cumin2001> Updating IPMI password on 1253 hosts - jbond@cumin2001 [production]
11:36 <jbond@cumin2001> START - Cookbook sre.hosts.ipmi-password-reset [production]
11:35 <arturo> icinga downtime cloudvirt1026 for 2h (T227536) [production]
11:35 <marostegui> Stop replication on db2077 to change triggers on db2095:3317 - T234704 [production]
11:23 <moritzm> installing reportbug updates from stretch point release [production]
11:22 <Lucas_WMDE> EU SWAT done [production]
11:21 <jbond@cumin2001> END (FAIL) - Cookbook sre.hosts.ipmi-password-reset (exit_code=99) [production]
11:21 <jbond@cumin2001> Updating IPMI password on 1253 hosts - jbond@cumin2001 [production]