2023-01-26
08:17 <marostegui> Starting s2 codfw failover from db2104 to db2107 - T327998 [production]
08:17 <marostegui@cumin1001> dbctl commit (dc=all): 'db2103 (re)pooling @ 5%: After switchover', diff saved to https://phabricator.wikimedia.org/P43365 and previous config saved to /var/cache/conftool/dbconfig/20230126-081738-root.json [production]
08:05 <marostegui@cumin1001> dbctl commit (dc=all): 'db1198 (re)pooling @ 75%: After DIMM replacement', diff saved to https://phabricator.wikimedia.org/P43364 and previous config saved to /var/cache/conftool/dbconfig/20230126-080533-root.json [production]
08:05 <marostegui@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 27 hosts with reason: Primary switchover s2 T327998 [production]
08:04 <marostegui@cumin1001> START - Cookbook sre.hosts.downtime for 1:00:00 on 27 hosts with reason: Primary switchover s2 T327998 [production]
08:04 <marostegui@cumin1001> dbctl commit (dc=all): 'Set db2107 with weight 0 T327998', diff saved to https://phabricator.wikimedia.org/P43363 and previous config saved to /var/cache/conftool/dbconfig/20230126-080427-root.json [production]
08:02 <marostegui@cumin1001> dbctl commit (dc=all): 'db2103 (re)pooling @ 1%: After switchover', diff saved to https://phabricator.wikimedia.org/P43362 and previous config saved to /var/cache/conftool/dbconfig/20230126-080233-root.json [production]
08:02 <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db2103 T327997', diff saved to https://phabricator.wikimedia.org/P43361 and previous config saved to /var/cache/conftool/dbconfig/20230126-080159-root.json [production]
08:00 <marostegui@cumin1001> dbctl commit (dc=all): 'Promote db2112 to s1 primary T327997', diff saved to https://phabricator.wikimedia.org/P43360 and previous config saved to /var/cache/conftool/dbconfig/20230126-080033-root.json [production]
08:00 <marostegui> Starting s1 codfw failover from db2103 to db2112 - T327997 [production]
07:50 <marostegui@cumin1001> dbctl commit (dc=all): 'db1198 (re)pooling @ 50%: After DIMM replacement', diff saved to https://phabricator.wikimedia.org/P43359 and previous config saved to /var/cache/conftool/dbconfig/20230126-075028-root.json [production]
07:49 <ryankemper@puppetmaster1001> conftool action : set/weight=10:pooled=inactive; selector: name=wdqs2012.* [production]
07:49 <ryankemper@puppetmaster1001> conftool action : set/weight=10:pooled=inactive; selector: name=wdqs2011.* [production]
07:49 <ryankemper@puppetmaster1001> conftool action : set/weight=10:pooled=inactive; selector: name=wdqs2010.* [production]
07:48 <ryankemper@puppetmaster1001> conftool action : set/weight=10:pooled=inactive; selector: name=wdqs2009.* [production]
07:36 <marostegui@cumin1001> dbctl commit (dc=all): 'Set db2112 with weight 0 T327997', diff saved to https://phabricator.wikimedia.org/P43358 and previous config saved to /var/cache/conftool/dbconfig/20230126-073616-root.json [production]
07:36 <marostegui@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 38 hosts with reason: Primary switchover s1 T327997 [production]
07:35 <marostegui@cumin1001> START - Cookbook sre.hosts.downtime for 1:00:00 on 38 hosts with reason: Primary switchover s1 T327997 [production]
07:35 <marostegui@cumin1001> dbctl commit (dc=all): 'db1198 (re)pooling @ 25%: After DIMM replacement', diff saved to https://phabricator.wikimedia.org/P43357 and previous config saved to /var/cache/conftool/dbconfig/20230126-073523-root.json [production]
07:25 <marostegui@deploy1002> Finished scap: Backport for [[gerrit:883699|ProductionServices.php: Depool pc2011 (T327925)]] (duration: 11m 19s) [production]
07:25 <dcausse> T322869: depooling wdqs2009 wdqs2010 wdqs2011 wdqs2012 these hosts should not serve user traffic yet they don't have the database loaded [production]
07:23 <marostegui> Failover m1 from db1195 to db1176 - T327800 [production]
07:20 <marostegui@cumin1001> dbctl commit (dc=all): 'db1198 (re)pooling @ 10%: After DIMM replacement', diff saved to https://phabricator.wikimedia.org/P43356 and previous config saved to /var/cache/conftool/dbconfig/20230126-072017-root.json [production]
07:18 <root@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on backup1001.eqiad.wmnet with reason: m1 switchover [production]
07:17 <root@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on backup1001.eqiad.wmnet with reason: m1 switchover [production]
07:17 <root@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on backupmon1001.eqiad.wmnet with reason: m1 switchover [production]
07:17 <root@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on backupmon1001.eqiad.wmnet with reason: m1 switchover [production]
07:16 <marostegui@deploy1002> marostegui: Backport for [[gerrit:883699|ProductionServices.php: Depool pc2011 (T327925)]] synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet [production]
07:14 <marostegui@deploy1002> Started scap: Backport for [[gerrit:883699|ProductionServices.php: Depool pc2011 (T327925)]] [production]
07:12 <marostegui@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db[2132,2160].codfw.wmnet,db[1117,1176,1195].eqiad.wmnet with reason: Primary switchover m1 T327800 [production]
07:12 <marostegui@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on db[2132,2160].codfw.wmnet,db[1117,1176,1195].eqiad.wmnet with reason: Primary switchover m1 T327800 [production]
07:05 <marostegui@cumin1001> dbctl commit (dc=all): 'db1198 (re)pooling @ 5%: After DIMM replacement', diff saved to https://phabricator.wikimedia.org/P43354 and previous config saved to /var/cache/conftool/dbconfig/20230126-070512-root.json [production]
07:02 <marostegui@cumin1001> dbctl commit (dc=all): 'Add some weight to db1103', diff saved to https://phabricator.wikimedia.org/P43353 and previous config saved to /var/cache/conftool/dbconfig/20230126-070220-marostegui.json [production]
07:01 <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db1120 T327861', diff saved to https://phabricator.wikimedia.org/P43352 and previous config saved to /var/cache/conftool/dbconfig/20230126-070158-root.json [production]
07:00 <marostegui@cumin1001> dbctl commit (dc=all): 'Promote db1103 to x1 primary and set section read-write T327861', diff saved to https://phabricator.wikimedia.org/P43351 and previous config saved to /var/cache/conftool/dbconfig/20230126-070035-marostegui.json [production]
07:00 <marostegui> Starting x1 eqiad failover from db1120 to db1103 - T327861 [production]
06:48 <brett@cumin1001> conftool action : set/pooled=yes; selector: name=cp6015.drmrs.wmnet [production]
06:48 <brett@cumin1001> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6015.drmrs.wmnet with OS bullseye [production]
06:32 <ladsgroup@deploy1002> Synchronized private/PrivateSettings.php: Rotating wikiuser password (T326802) (duration: 07m 23s) [production]
06:20 <brett@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6015.drmrs.wmnet with reason: host reimage [production]
06:18 <brett@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on cp6015.drmrs.wmnet with reason: host reimage [production]
06:17 <marostegui@cumin1001> dbctl commit (dc=all): 'Set db1103 with weight 0 T327861', diff saved to https://phabricator.wikimedia.org/P43350 and previous config saved to /var/cache/conftool/dbconfig/20230126-061751-root.json [production]
06:17 <marostegui@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 10 hosts with reason: Primary switchover x1 T327861 [production]
06:16 <marostegui@cumin1001> START - Cookbook sre.hosts.downtime for 1:00:00 on 10 hosts with reason: Primary switchover x1 T327861 [production]
05:57 <brett@cumin1001> START - Cookbook sre.hosts.reimage for host cp6015.drmrs.wmnet with OS bullseye [production]
05:53 <brett@cumin1001> conftool action : set/pooled=yes; selector: name=cp6006.drmrs.wmnet [production]
05:53 <brett@cumin1001> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6006.drmrs.wmnet with OS bullseye [production]
05:32 <brett@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6006.drmrs.wmnet with reason: host reimage [production]
05:28 <brett@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on cp6006.drmrs.wmnet with reason: host reimage [production]
05:10 <brett@cumin1001> START - Cookbook sre.hosts.reimage for host cp6006.drmrs.wmnet with OS bullseye [production]