2019-10-10
§
|
14:57 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Fully repool db1074 after getting its BBU replaced T231638', diff saved to https://phabricator.wikimedia.org/P9306 and previous config saved to /var/cache/conftool/dbconfig/20191010-145737-marostegui.json |
[production] |
14:54 |
<moritzm> |
ran systemctl reset-failed on puppetmaster1001 (puppet-master.service after reimage) |
[production] |
14:42 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Slowly repool db1074 after BBU replacement T231638', diff saved to https://phabricator.wikimedia.org/P9305 and previous config saved to /var/cache/conftool/dbconfig/20191010-144201-marostegui.json |
[production] |
14:39 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Fully repool db1112 into recentchanges and remove db1078 from it after PDU maintenance', diff saved to https://phabricator.wikimedia.org/P9304 and previous config saved to /var/cache/conftool/dbconfig/20191010-143924-marostegui.json |
[production] |
14:36 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Fully repool to db1084 db1083 db1076 db1112 db1118 after PDU maintenance', diff saved to https://phabricator.wikimedia.org/P9303 and previous config saved to /var/cache/conftool/dbconfig/20191010-143633-marostegui.json |
[production] |
14:23 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'More traffic to db1084 db1083 db1076 db1112 db1118 after PDU maintenance', diff saved to https://phabricator.wikimedia.org/P9302 and previous config saved to /var/cache/conftool/dbconfig/20191010-142323-marostegui.json |
[production] |
14:13 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Slowly repool db1084 db1083 db1076 db1112 db1118 after PDU maintenance', diff saved to https://phabricator.wikimedia.org/P9301 and previous config saved to /var/cache/conftool/dbconfig/20191010-141303-marostegui.json |
[production] |
14:04 |
<marostegui@deploy1001> |
Synchronized wmf-config/db-eqiad.php: Fully repool es1013, es1014 after PDU maintenance (duration: 00m 59s) |
[production] |
14:03 |
<jbond42> |
re-enable puppet now that the CA has been correctly moved |
[production] |
13:58 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Slowly repool db1112 after PDU maintenance', diff saved to https://phabricator.wikimedia.org/P9300 and previous config saved to /var/cache/conftool/dbconfig/20191010-135806-marostegui.json |
[production] |
13:57 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Slowly repool db1084 db1083 db1076 db1118 after PDU maintenance', diff saved to https://phabricator.wikimedia.org/P9299 and previous config saved to /var/cache/conftool/dbconfig/20191010-135659-marostegui.json |
[production] |
13:50 |
<jbond42> |
disable puppet fleet-wide as puppetmaster2002 is struggling |
[production] |
13:32 |
<jbond42> |
reimage puppetmaster1001 |
[production] |
13:27 |
<marostegui> |
Repool labsdb1011 after reclone - T235016 |
[production] |
13:16 |
<arturo> |
added flannel 0.5.5-4 to buster-wikimedia (T235059) |
[production] |
13:05 |
<marostegui@deploy1001> |
Synchronized wmf-config/db-eqiad.php: More traffic to es1013, es1014 after PDU maintenance (duration: 00m 58s) |
[production] |
13:00 |
<jbond@cumin2001> |
END (PASS) - Cookbook sre.hosts.ipmi-password-reset (exit_code=0) |
[production] |
12:41 |
<marostegui@deploy1001> |
Synchronized wmf-config/db-eqiad.php: Slowly repool es1013, es1014 after PDU maintenance (duration: 00m 59s) |
[production] |
11:57 |
<jbond@cumin2001> |
Updating IPMI password on 1253 hosts - jbond@cumin2001 |
[production] |
11:57 |
<jbond@cumin2001> |
START - Cookbook sre.hosts.ipmi-password-reset |
[production] |
11:48 |
<jbond@cumin2001> |
END (PASS) - Cookbook sre.hosts.ipmi-password-reset (exit_code=0) |
[production] |
11:46 |
<jbond@cumin2001> |
Updating IPMI password on 35 hosts - jbond@cumin2001 |
[production] |
11:46 |
<jbond@cumin2001> |
START - Cookbook sre.hosts.ipmi-password-reset |
[production] |
11:41 |
<lucaswerkmeister-wmde@deploy1001> |
Synchronized wmf-config/Wikibase.php: [[gerrit:542087|Fix typo in beta repo data bridge config (T235033)]] (duration: 00m 59s) |
[production] |
11:40 |
<marostegui> |
Deploy schema change on s7 codfw master (db2118), this will generate lag on s7 codfw - T234066 T233135 |
[production] |
11:38 |
<jbond@cumin2001> |
END (FAIL) - Cookbook sre.hosts.ipmi-password-reset (exit_code=99) |
[production] |
11:38 |
<jbond@cumin2001> |
Updating IPMI password on 1253 hosts - jbond@cumin2001 |
[production] |
11:38 |
<jbond@cumin2001> |
START - Cookbook sre.hosts.ipmi-password-reset |
[production] |
11:37 |
<arturo> |
icinga downtime cloudvirt1023 for 2h (T227536) |
[production] |
11:36 |
<arturo> |
icinga downtime cloudvirt1025 for 2h (T227536) |
[production] |
11:36 |
<jbond@cumin2001> |
END (FAIL) - Cookbook sre.hosts.ipmi-password-reset (exit_code=99) |
[production] |
11:36 |
<jbond@cumin2001> |
Updating IPMI password on 1253 hosts - jbond@cumin2001 |
[production] |
11:36 |
<jbond@cumin2001> |
START - Cookbook sre.hosts.ipmi-password-reset |
[production] |
11:35 |
<arturo> |
icinga downtime cloudvirt1026 for 2h (T227536) |
[production] |
11:35 |
<marostegui> |
Stop replication on db2077 to change triggers on db2095:3317 - T234704 |
[production] |
11:23 |
<moritzm> |
installing reportbug updates from stretch point release |
[production] |
11:22 |
<Lucas_WMDE> |
EU SWAT done |
[production] |
11:21 |
<jbond@cumin2001> |
END (FAIL) - Cookbook sre.hosts.ipmi-password-reset (exit_code=99) |
[production] |
11:21 |
<jbond@cumin2001> |
Updating IPMI password on 1253 hosts - jbond@cumin2001 |
[production] |
11:21 |
<jbond@cumin2001> |
START - Cookbook sre.hosts.ipmi-password-reset |
[production] |
11:21 |
<lucaswerkmeister-wmde@deploy1001> |
Synchronized wmf-config/: SWAT: [[gerrit:542081|Set dataBridgeEnabled repo setting on beta (T235033)]] (affects InitialiseSettings-labs.php and Wikibase.php, but Wikibase.php part is guarded by isset(), so should be safe to sync both at once, I think) (duration: 01m 00s) |
[production] |
11:21 |
<jbond@cumin2001> |
END (FAIL) - Cookbook sre.hosts.ipmi-password-reset (exit_code=99) |
[production] |
11:21 |
<jbond@cumin2001> |
START - Cookbook sre.hosts.ipmi-password-reset |
[production] |
11:14 |
<Lucas_WMDE> |
^ (and by CS, I actually mean Wikibase.php, not CommonSettings.php, sorry) |
[production] |
11:13 |
<lucaswerkmeister-wmde@deploy1001> |
Synchronized wmf-config/: SWAT: [[gerrit:542080|Rename data bridge config variable names (T235033)]] (affects IS-labs and CS, but the CS part is all guarded by isset(), so should be safe to sync both at once, I think) (duration: 01m 00s) |
[production] |
10:38 |
<moritzm> |
rebalancing Ganeti eqiad/row C after rolling reboots of Ganeti nodes |
[production] |
10:34 |
<volans> |
uploaded spicerack_0.0.28-1_amd64.deb to apt.wikimedia.org stretch-wikimedia |
[production] |
08:23 |
<@> |
helmfile [EQIAD] Ran 'apply' command on namespace 'restrouter' for release 'production' . |
[production] |
08:20 |
<@> |
helmfile [CODFW] Ran 'apply' command on namespace 'restrouter' for release 'production' . |
[production] |
08:17 |
<@> |
helmfile [STAGING] Ran 'apply' command on namespace 'restrouter' for release 'staging' . |
[production] |