2024-05-30
§
|
05:17 |
<marostegui> |
Deploy schema changes on old s8 eqiad master (db1209) dbmaint T355609 T356166 |
[production] |
05:16 |
<logmsgbot> |
@deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply |
[production] |
05:16 |
<logmsgbot> |
@deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply |
[production] |
05:14 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2121 (T366123)', diff saved to https://phabricator.wikimedia.org/P63634 and previous config saved to /var/cache/conftool/dbconfig/20240530-051433-marostegui.json |
[production] |
05:14 |
<logmsgbot> |
@deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply |
[production] |
05:13 |
<logmsgbot> |
@deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply |
[production] |
05:12 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Depooling db2121 (T366123)', diff saved to https://phabricator.wikimedia.org/P63633 and previous config saved to /var/cache/conftool/dbconfig/20240530-051220-marostegui.json |
[production] |
05:12 |
<marostegui@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2121.codfw.wmnet with reason: Maintenance |
[production] |
05:11 |
<marostegui@cumin1002> |
START - Cookbook sre.hosts.downtime for 8:00:00 on db2121.codfw.wmnet with reason: Maintenance |
[production] |
05:11 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Depool db1209 T364541', diff saved to https://phabricator.wikimedia.org/P63632 and previous config saved to /var/cache/conftool/dbconfig/20240530-051132-root.json |
[production] |
05:10 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Promote db1192 to s8 primary and set section read-write T364541', diff saved to https://phabricator.wikimedia.org/P63631 and previous config saved to /var/cache/conftool/dbconfig/20240530-051031-marostegui.json |
[production] |
05:10 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Set s8 eqiad as read-only for maintenance - T364541', diff saved to https://phabricator.wikimedia.org/P63630 and previous config saved to /var/cache/conftool/dbconfig/20240530-051012-marostegui.json |
[production] |
05:09 |
<marostegui> |
Starting s8 eqiad failover from db1209 to db1192 - T364541 |
[production] |
05:02 |
<pt1979@cumin2002> |
START - Cookbook sre.dns.netbox |
[production] |
04:56 |
<logmsgbot> |
@deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply |
[production] |
04:56 |
<logmsgbot> |
@deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply |
[production] |
04:43 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Remove db1192 from API/vslow/dump T364541', diff saved to https://phabricator.wikimedia.org/P63629 and previous config saved to /var/cache/conftool/dbconfig/20240530-044328-root.json |
[production] |
04:43 |
<marostegui@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 33 hosts with reason: Primary switchover s8 T364541 |
[production] |
04:42 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Set db1192 with weight 0 T364541', diff saved to https://phabricator.wikimedia.org/P63628 and previous config saved to /var/cache/conftool/dbconfig/20240530-044249-root.json |
[production] |
04:42 |
<marostegui@cumin1002> |
START - Cookbook sre.hosts.downtime for 1:00:00 on 33 hosts with reason: Primary switchover s8 T364541 |
[production] |
04:42 |
<marostegui@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db2098.codfw.wmnet with reason: Maintenance |
[production] |
04:42 |
<marostegui@cumin1002> |
START - Cookbook sre.hosts.downtime for 8:00:00 on db2098.codfw.wmnet with reason: Maintenance |
[production] |
04:20 |
<logmsgbot> |
@deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply |
[production] |
04:20 |
<logmsgbot> |
@deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply |
[production] |
04:13 |
<logmsgbot> |
@deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply |
[production] |
04:13 |
<logmsgbot> |
@deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply |
[production] |
04:11 |
<logmsgbot> |
@deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply |
[production] |
04:11 |
<logmsgbot> |
@deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply |
[production] |
04:09 |
<logmsgbot> |
@deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply |
[production] |
04:08 |
<logmsgbot> |
@deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply |
[production] |
04:06 |
<logmsgbot> |
@deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply |
[production] |
04:06 |
<logmsgbot> |
@deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply |
[production] |
02:59 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2220 (T364299)', diff saved to https://phabricator.wikimedia.org/P63627 and previous config saved to /var/cache/conftool/dbconfig/20240530-025955-marostegui.json |
[production] |
02:44 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2220', diff saved to https://phabricator.wikimedia.org/P63626 and previous config saved to /var/cache/conftool/dbconfig/20240530-024447-marostegui.json |
[production] |
02:29 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2220', diff saved to https://phabricator.wikimedia.org/P63625 and previous config saved to /var/cache/conftool/dbconfig/20240530-022938-marostegui.json |
[production] |
02:27 |
<pt1979@cumin2002> |
END (PASS) - Cookbook sre.network.provision (exit_code=0) for device lsw1-c4-codfw.mgmt.codfw.wmnet |
[production] |
02:14 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2220 (T364299)', diff saved to https://phabricator.wikimedia.org/P63624 and previous config saved to /var/cache/conftool/dbconfig/20240530-021430-marostegui.json |
[production] |
01:59 |
<logmsgbot> |
@deploy1002 helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply |
[production] |
01:59 |
<logmsgbot> |
@deploy1002 helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply |
[production] |
01:56 |
<pt1979@cumin2002> |
END (PASS) - Cookbook sre.dns.netbox (exit_code=0) |
[production] |
01:56 |
<pt1979@cumin2002> |
END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for lsw1-c4-codfw - pt1979@cumin2002" |
[production] |
01:55 |
<pt1979@cumin2002> |
START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for lsw1-c4-codfw - pt1979@cumin2002" |
[production] |
01:48 |
<ladsgroup@cumin1002> |
dbctl commit (dc=all): 'Depooling db2138 (T352010)', diff saved to https://phabricator.wikimedia.org/P63623 and previous config saved to /var/cache/conftool/dbconfig/20240530-014850-ladsgroup.json |
[production] |
01:48 |
<ladsgroup@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2138.codfw.wmnet with reason: Maintenance |
[production] |
01:48 |
<ladsgroup@cumin1002> |
START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2138.codfw.wmnet with reason: Maintenance |
[production] |
01:48 |
<ladsgroup@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2126 (T352010)', diff saved to https://phabricator.wikimedia.org/P63622 and previous config saved to /var/cache/conftool/dbconfig/20240530-014827-ladsgroup.json |
[production] |
01:47 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Depooling db2147 (T364069)', diff saved to https://phabricator.wikimedia.org/P63621 and previous config saved to /var/cache/conftool/dbconfig/20240530-014725-marostegui.json |
[production] |
01:47 |
<marostegui@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2147.codfw.wmnet with reason: Maintenance |
[production] |
01:47 |
<marostegui@cumin1002> |
START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2147.codfw.wmnet with reason: Maintenance |
[production] |
01:39 |
<pt1979@cumin2002> |
START - Cookbook sre.dns.netbox |
[production] |