2023-01-26
|
08:17 |
<marostegui> |
Starting s2 codfw failover from db2104 to db2107 - T327998 |
[production] |
08:17 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db2103 (re)pooling @ 5%: After switchover', diff saved to https://phabricator.wikimedia.org/P43365 and previous config saved to /var/cache/conftool/dbconfig/20230126-081738-root.json |
[production] |
08:05 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1198 (re)pooling @ 75%: After DIMM replacement', diff saved to https://phabricator.wikimedia.org/P43364 and previous config saved to /var/cache/conftool/dbconfig/20230126-080533-root.json |
[production] |
08:05 |
<marostegui@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 27 hosts with reason: Primary switchover s2 T327998 |
[production] |
08:04 |
<marostegui@cumin1001> |
START - Cookbook sre.hosts.downtime for 1:00:00 on 27 hosts with reason: Primary switchover s2 T327998 |
[production] |
08:04 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Set db2107 with weight 0 T327998', diff saved to https://phabricator.wikimedia.org/P43363 and previous config saved to /var/cache/conftool/dbconfig/20230126-080427-root.json |
[production] |
08:02 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db2103 (re)pooling @ 1%: After switchover', diff saved to https://phabricator.wikimedia.org/P43362 and previous config saved to /var/cache/conftool/dbconfig/20230126-080233-root.json |
[production] |
08:02 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Depool db2103 T327997', diff saved to https://phabricator.wikimedia.org/P43361 and previous config saved to /var/cache/conftool/dbconfig/20230126-080159-root.json |
[production] |
08:00 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Promote db2112 to s1 primary T327997', diff saved to https://phabricator.wikimedia.org/P43360 and previous config saved to /var/cache/conftool/dbconfig/20230126-080033-root.json |
[production] |
08:00 |
<marostegui> |
Starting s1 codfw failover from db2103 to db2112 - T327997 |
[production] |
07:50 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1198 (re)pooling @ 50%: After DIMM replacement', diff saved to https://phabricator.wikimedia.org/P43359 and previous config saved to /var/cache/conftool/dbconfig/20230126-075028-root.json |
[production] |
07:49 |
<ryankemper@puppetmaster1001> |
conftool action : set/weight=10:pooled=inactive; selector: name=wdqs2012.* |
[production] |
07:49 |
<ryankemper@puppetmaster1001> |
conftool action : set/weight=10:pooled=inactive; selector: name=wdqs2011.* |
[production] |
07:49 |
<ryankemper@puppetmaster1001> |
conftool action : set/weight=10:pooled=inactive; selector: name=wdqs2010.* |
[production] |
07:48 |
<ryankemper@puppetmaster1001> |
conftool action : set/weight=10:pooled=inactive; selector: name=wdqs2009.* |
[production] |
07:36 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Set db2112 with weight 0 T327997', diff saved to https://phabricator.wikimedia.org/P43358 and previous config saved to /var/cache/conftool/dbconfig/20230126-073616-root.json |
[production] |
07:36 |
<marostegui@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 38 hosts with reason: Primary switchover s1 T327997 |
[production] |
07:35 |
<marostegui@cumin1001> |
START - Cookbook sre.hosts.downtime for 1:00:00 on 38 hosts with reason: Primary switchover s1 T327997 |
[production] |
07:35 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1198 (re)pooling @ 25%: After DIMM replacement', diff saved to https://phabricator.wikimedia.org/P43357 and previous config saved to /var/cache/conftool/dbconfig/20230126-073523-root.json |
[production] |
07:25 |
<marostegui@deploy1002> |
Finished scap: Backport for [[gerrit:883699|ProductionServices.php: Depool pc2011 (T327925)]] (duration: 11m 19s) |
[production] |
07:25 |
<dcausse> |
T322869: depooling wdqs2009, wdqs2010, wdqs2011 and wdqs2012; these hosts should not serve user traffic yet because they don't have the database loaded |
[production] |
07:23 |
<marostegui> |
Failover m1 from db1195 to db1176 - T327800 |
[production] |
07:20 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1198 (re)pooling @ 10%: After DIMM replacement', diff saved to https://phabricator.wikimedia.org/P43356 and previous config saved to /var/cache/conftool/dbconfig/20230126-072017-root.json |
[production] |
07:18 |
<root@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on backup1001.eqiad.wmnet with reason: m1 switchover |
[production] |
07:17 |
<root@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on backup1001.eqiad.wmnet with reason: m1 switchover |
[production] |
07:17 |
<root@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on backupmon1001.eqiad.wmnet with reason: m1 switchover |
[production] |
07:17 |
<root@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on backupmon1001.eqiad.wmnet with reason: m1 switchover |
[production] |
07:16 |
<marostegui@deploy1002> |
marostegui: Backport for [[gerrit:883699|ProductionServices.php: Depool pc2011 (T327925)]] synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet |
[production] |
07:14 |
<marostegui@deploy1002> |
Started scap: Backport for [[gerrit:883699|ProductionServices.php: Depool pc2011 (T327925)]] |
[production] |
07:12 |
<marostegui@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db[2132,2160].codfw.wmnet,db[1117,1176,1195].eqiad.wmnet with reason: Primary switchover m1 T327800 |
[production] |
07:12 |
<marostegui@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on db[2132,2160].codfw.wmnet,db[1117,1176,1195].eqiad.wmnet with reason: Primary switchover m1 T327800 |
[production] |
07:05 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1198 (re)pooling @ 5%: After DIMM replacement', diff saved to https://phabricator.wikimedia.org/P43354 and previous config saved to /var/cache/conftool/dbconfig/20230126-070512-root.json |
[production] |
07:02 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Add some weight to db1103', diff saved to https://phabricator.wikimedia.org/P43353 and previous config saved to /var/cache/conftool/dbconfig/20230126-070220-marostegui.json |
[production] |
07:01 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Depool db1120 T327861', diff saved to https://phabricator.wikimedia.org/P43352 and previous config saved to /var/cache/conftool/dbconfig/20230126-070158-root.json |
[production] |
07:00 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Promote db1103 to x1 primary and set section read-write T327861', diff saved to https://phabricator.wikimedia.org/P43351 and previous config saved to /var/cache/conftool/dbconfig/20230126-070035-marostegui.json |
[production] |
07:00 |
<marostegui> |
Starting x1 eqiad failover from db1120 to db1103 - T327861 |
[production] |
06:48 |
<brett@cumin1001> |
conftool action : set/pooled=yes; selector: name=cp6015.drmrs.wmnet |
[production] |
06:48 |
<brett@cumin1001> |
END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6015.drmrs.wmnet with OS bullseye |
[production] |
06:32 |
<ladsgroup@deploy1002> |
Synchronized private/PrivateSettings.php: Rotating wikiuser password (T326802) (duration: 07m 23s) |
[production] |
06:20 |
<brett@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6015.drmrs.wmnet with reason: host reimage |
[production] |
06:18 |
<brett@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on cp6015.drmrs.wmnet with reason: host reimage |
[production] |
06:17 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Set db1103 with weight 0 T327861', diff saved to https://phabricator.wikimedia.org/P43350 and previous config saved to /var/cache/conftool/dbconfig/20230126-061751-root.json |
[production] |
06:17 |
<marostegui@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 10 hosts with reason: Primary switchover x1 T327861 |
[production] |
06:16 |
<marostegui@cumin1001> |
START - Cookbook sre.hosts.downtime for 1:00:00 on 10 hosts with reason: Primary switchover x1 T327861 |
[production] |
05:57 |
<brett@cumin1001> |
START - Cookbook sre.hosts.reimage for host cp6015.drmrs.wmnet with OS bullseye |
[production] |
05:53 |
<brett@cumin1001> |
conftool action : set/pooled=yes; selector: name=cp6006.drmrs.wmnet |
[production] |
05:53 |
<brett@cumin1001> |
END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6006.drmrs.wmnet with OS bullseye |
[production] |
05:32 |
<brett@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6006.drmrs.wmnet with reason: host reimage |
[production] |
05:28 |
<brett@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on cp6006.drmrs.wmnet with reason: host reimage |
[production] |
05:10 |
<brett@cumin1001> |
START - Cookbook sre.hosts.reimage for host cp6006.drmrs.wmnet with OS bullseye |
[production] |