2023-01-26
07:25 <marostegui@deploy1002> Finished scap: Backport for [[gerrit:883699|ProductionServices.php: Depool pc2011 (T327925)]] (duration: 11m 19s) [production]
07:25 <dcausse> T322869: depooling wdqs2009 wdqs2010 wdqs2011 wdqs2012; these hosts should not serve user traffic yet, as they don't have the database loaded [production]
07:23 <marostegui> Failover m1 from db1195 to db1176 - T327800 [production]
07:20 <marostegui@cumin1001> dbctl commit (dc=all): 'db1198 (re)pooling @ 10%: After DIMM replacement', diff saved to https://phabricator.wikimedia.org/P43356 and previous config saved to /var/cache/conftool/dbconfig/20230126-072017-root.json [production]
07:18 <root@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on backup1001.eqiad.wmnet with reason: m1 switchover [production]
07:17 <root@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on backup1001.eqiad.wmnet with reason: m1 switchover [production]
07:17 <root@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on backupmon1001.eqiad.wmnet with reason: m1 switchover [production]
07:17 <root@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on backupmon1001.eqiad.wmnet with reason: m1 switchover [production]
07:16 <marostegui@deploy1002> marostegui: Backport for [[gerrit:883699|ProductionServices.php: Depool pc2011 (T327925)]] synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet [production]
07:14 <marostegui@deploy1002> Started scap: Backport for [[gerrit:883699|ProductionServices.php: Depool pc2011 (T327925)]] [production]
07:12 <marostegui@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db[2132,2160].codfw.wmnet,db[1117,1176,1195].eqiad.wmnet with reason: Primary switchover m1 T327800 [production]
07:12 <marostegui@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on db[2132,2160].codfw.wmnet,db[1117,1176,1195].eqiad.wmnet with reason: Primary switchover m1 T327800 [production]
07:05 <marostegui@cumin1001> dbctl commit (dc=all): 'db1198 (re)pooling @ 5%: After DIMM replacement', diff saved to https://phabricator.wikimedia.org/P43354 and previous config saved to /var/cache/conftool/dbconfig/20230126-070512-root.json [production]
07:02 <marostegui@cumin1001> dbctl commit (dc=all): 'Add some weight to db1103', diff saved to https://phabricator.wikimedia.org/P43353 and previous config saved to /var/cache/conftool/dbconfig/20230126-070220-marostegui.json [production]
07:01 <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db1120 T327861', diff saved to https://phabricator.wikimedia.org/P43352 and previous config saved to /var/cache/conftool/dbconfig/20230126-070158-root.json [production]
07:00 <marostegui@cumin1001> dbctl commit (dc=all): 'Promote db1103 to x1 primary and set section read-write T327861', diff saved to https://phabricator.wikimedia.org/P43351 and previous config saved to /var/cache/conftool/dbconfig/20230126-070035-marostegui.json [production]
07:00 <marostegui> Starting x1 eqiad failover from db1120 to db1103 - T327861 [production]
06:48 <brett@cumin1001> conftool action : set/pooled=yes; selector: name=cp6015.drmrs.wmnet [production]
06:48 <brett@cumin1001> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6015.drmrs.wmnet with OS bullseye [production]
06:32 <ladsgroup@deploy1002> Synchronized private/PrivateSettings.php: Rotating wikiuser password (T326802) (duration: 07m 23s) [production]
06:20 <brett@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6015.drmrs.wmnet with reason: host reimage [production]
06:18 <brett@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on cp6015.drmrs.wmnet with reason: host reimage [production]
06:17 <marostegui@cumin1001> dbctl commit (dc=all): 'Set db1103 with weight 0 T327861', diff saved to https://phabricator.wikimedia.org/P43350 and previous config saved to /var/cache/conftool/dbconfig/20230126-061751-root.json [production]
06:17 <marostegui@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 10 hosts with reason: Primary switchover x1 T327861 [production]
06:16 <marostegui@cumin1001> START - Cookbook sre.hosts.downtime for 1:00:00 on 10 hosts with reason: Primary switchover x1 T327861 [production]
05:57 <brett@cumin1001> START - Cookbook sre.hosts.reimage for host cp6015.drmrs.wmnet with OS bullseye [production]
05:53 <brett@cumin1001> conftool action : set/pooled=yes; selector: name=cp6006.drmrs.wmnet [production]
05:53 <brett@cumin1001> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6006.drmrs.wmnet with OS bullseye [production]
05:32 <brett@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6006.drmrs.wmnet with reason: host reimage [production]
05:28 <brett@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on cp6006.drmrs.wmnet with reason: host reimage [production]
05:10 <brett@cumin1001> START - Cookbook sre.hosts.reimage for host cp6006.drmrs.wmnet with OS bullseye [production]
05:09 <brett@cumin1001> conftool action : set/pooled=yes; selector: name=cp6014.drmrs.wmnet [production]
05:07 <brett@cumin1001> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6014.drmrs.wmnet with OS bullseye [production]
04:45 <brett@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6014.drmrs.wmnet with reason: host reimage [production]
04:42 <brett@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on cp6014.drmrs.wmnet with reason: host reimage [production]
04:24 <brett@cumin1001> START - Cookbook sre.hosts.reimage for host cp6014.drmrs.wmnet with OS bullseye [production]
04:22 <brett@cumin1001> conftool action : set/pooled=yes; selector: name=cp6005.drmrs.wmnet [production]
04:17 <brett@cumin1001> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6005.drmrs.wmnet with OS bullseye [production]
03:52 <brett@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6005.drmrs.wmnet with reason: host reimage [production]
03:49 <brett@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on cp6005.drmrs.wmnet with reason: host reimage [production]
03:29 <brett@cumin1001> START - Cookbook sre.hosts.reimage for host cp6005.drmrs.wmnet with OS bullseye [production]
03:27 <brett@cumin1001> conftool action : set/pooled=yes; selector: name=cp6013.drmrs.wmnet [production]
03:27 <ejegg> payments-wiki upgraded from 08b8c3bc to 82d89841 [production]
03:26 <brett@cumin1001> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6013.drmrs.wmnet with OS bullseye [production]
03:04 <brett@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6013.drmrs.wmnet with reason: host reimage [production]
03:01 <brett@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on cp6013.drmrs.wmnet with reason: host reimage [production]
02:41 <brett@cumin1001> START - Cookbook sre.hosts.reimage for host cp6013.drmrs.wmnet with OS bullseye [production]
02:30 <sukhe@cumin2002> END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp2027.codfw.wmnet with OS bullseye [production]
02:17 <sukhe@cumin2002> START - Cookbook sre.hosts.reimage for host cp2027.codfw.wmnet with OS bullseye [production]
02:17 <sukhe@cumin2002> END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp2027.codfw.wmnet with OS bullseye [production]