2023-01-26
07:00 <marostegui@cumin1001> dbctl commit (dc=all): 'Promote db1103 to x1 primary and set section read-write T327861', diff saved to https://phabricator.wikimedia.org/P43351 and previous config saved to /var/cache/conftool/dbconfig/20230126-070035-marostegui.json [production]
07:00 <marostegui> Starting x1 eqiad failover from db1120 to db1103 - T327861 [production]
06:48 <brett@cumin1001> conftool action : set/pooled=yes; selector: name=cp6015.drmrs.wmnet [production]
06:48 <brett@cumin1001> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6015.drmrs.wmnet with OS bullseye [production]
06:32 <ladsgroup@deploy1002> Synchronized private/PrivateSettings.php: Rotating wikiuser password (T326802) (duration: 07m 23s) [production]
06:20 <brett@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6015.drmrs.wmnet with reason: host reimage [production]
06:18 <brett@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on cp6015.drmrs.wmnet with reason: host reimage [production]
06:17 <marostegui@cumin1001> dbctl commit (dc=all): 'Set db1103 with weight 0 T327861', diff saved to https://phabricator.wikimedia.org/P43350 and previous config saved to /var/cache/conftool/dbconfig/20230126-061751-root.json [production]
06:17 <marostegui@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 10 hosts with reason: Primary switchover x1 T327861 [production]
06:16 <marostegui@cumin1001> START - Cookbook sre.hosts.downtime for 1:00:00 on 10 hosts with reason: Primary switchover x1 T327861 [production]
05:57 <brett@cumin1001> START - Cookbook sre.hosts.reimage for host cp6015.drmrs.wmnet with OS bullseye [production]
05:53 <brett@cumin1001> conftool action : set/pooled=yes; selector: name=cp6006.drmrs.wmnet [production]
05:53 <brett@cumin1001> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6006.drmrs.wmnet with OS bullseye [production]
05:32 <brett@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6006.drmrs.wmnet with reason: host reimage [production]
05:28 <brett@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on cp6006.drmrs.wmnet with reason: host reimage [production]
05:10 <brett@cumin1001> START - Cookbook sre.hosts.reimage for host cp6006.drmrs.wmnet with OS bullseye [production]
05:09 <brett@cumin1001> conftool action : set/pooled=yes; selector: name=cp6014.drmrs.wmnet [production]
05:07 <brett@cumin1001> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6014.drmrs.wmnet with OS bullseye [production]
04:45 <brett@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6014.drmrs.wmnet with reason: host reimage [production]
04:42 <brett@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on cp6014.drmrs.wmnet with reason: host reimage [production]
04:24 <brett@cumin1001> START - Cookbook sre.hosts.reimage for host cp6014.drmrs.wmnet with OS bullseye [production]
04:22 <brett@cumin1001> conftool action : set/pooled=yes; selector: name=cp6005.drmrs.wmnet [production]
04:17 <brett@cumin1001> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6005.drmrs.wmnet with OS bullseye [production]
03:52 <brett@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6005.drmrs.wmnet with reason: host reimage [production]
03:49 <brett@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on cp6005.drmrs.wmnet with reason: host reimage [production]
03:29 <brett@cumin1001> START - Cookbook sre.hosts.reimage for host cp6005.drmrs.wmnet with OS bullseye [production]
03:27 <brett@cumin1001> conftool action : set/pooled=yes; selector: name=cp6013.drmrs.wmnet [production]
03:27 <ejegg> payments-wiki upgraded from 08b8c3bc to 82d89841 [production]
03:26 <brett@cumin1001> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6013.drmrs.wmnet with OS bullseye [production]
03:04 <brett@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6013.drmrs.wmnet with reason: host reimage [production]
03:01 <brett@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on cp6013.drmrs.wmnet with reason: host reimage [production]
02:41 <brett@cumin1001> START - Cookbook sre.hosts.reimage for host cp6013.drmrs.wmnet with OS bullseye [production]
02:30 <sukhe@cumin2002> END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp2027.codfw.wmnet with OS bullseye [production]
02:17 <sukhe@cumin2002> START - Cookbook sre.hosts.reimage for host cp2027.codfw.wmnet with OS bullseye [production]
02:17 <sukhe@cumin2002> END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp2027.codfw.wmnet with OS bullseye [production]
01:58 <ejegg> restarted fundraising scheduled jobs after queue server reboot [production]
01:53 <sukhe@cumin2002> START - Cookbook sre.hosts.reimage for host cp2027.codfw.wmnet with OS bullseye [production]
01:49 <sukhe@puppetmaster1001> conftool action : set/pooled=yes; selector: name=cp2028.codfw.wmnet,service=ats-be [production]
01:49 <sukhe@puppetmaster1001> conftool action : set/pooled=yes; selector: name=cp2028.codfw.wmnet,service=cdn [production]
01:48 <sukhe@cumin2002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on cp2027.codfw.wmnet with reason: firmware test [production]
01:48 <sukhe@cumin2002> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on cp2027.codfw.wmnet with reason: firmware test [production]
01:46 <sukhe@puppetmaster1001> conftool action : set/pooled=no; selector: name=cp2027.codfw.wmnet,service=ats-be [production]
01:46 <sukhe@puppetmaster1001> conftool action : set/pooled=no; selector: name=cp2027.codfw.wmnet,service=cdn [production]
01:46 <sukhe@cumin2002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp2028.codfw.wmnet with OS bullseye [production]
01:24 <ejegg> payments-wiki upgraded from 15395d05 to 08b8c3bc (upgraded from MW 1.35 to MW 1.39) [production]
01:23 <sukhe@cumin2002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp2028.codfw.wmnet with reason: host reimage [production]
01:20 <sukhe@cumin2002> START - Cookbook sre.hosts.downtime for 2:00:00 on cp2028.codfw.wmnet with reason: host reimage [production]
01:19 <eevans@cumin1001> END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching cassandra-dev2*: Enable internode encryption - eevans@cumin1001 [production]
01:14 <ejegg> disabled fundraising scheduled jobs for queue server reboot [production]
01:05 <sukhe@cumin2002> START - Cookbook sre.hosts.reimage for host cp2028.codfw.wmnet with OS bullseye [production]