2022-11-30
§
|
23:24 |
<pt1979@cumin2002> |
START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db1206'] |
[production] |
23:24 |
<pt1979@cumin2002> |
END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1206.mgmt.eqiad.wmnet with reboot policy FORCED |
[production] |
23:22 |
<brett@cumin1001> |
END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5025.eqsin.wmnet with OS buster |
[production] |
23:18 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1118', diff saved to https://phabricator.wikimedia.org/P41959 and previous config saved to /var/cache/conftool/dbconfig/20221130-231808-ladsgroup.json |
[production] |
23:11 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to https://phabricator.wikimedia.org/P41958 and previous config saved to /var/cache/conftool/dbconfig/20221130-231130-ladsgroup.json |
[production] |
23:06 |
<tgr> |
running GrowthExperiments refreshUserImpactData.php (and generating a bunch of AQS requests) for T323958 |
[production] |
23:03 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1118 (T322618)', diff saved to https://phabricator.wikimedia.org/P41957 and previous config saved to /var/cache/conftool/dbconfig/20221130-230301-ladsgroup.json |
[production] |
23:01 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Depooling db1118 (T322618)', diff saved to https://phabricator.wikimedia.org/P41956 and previous config saved to /var/cache/conftool/dbconfig/20221130-230154-ladsgroup.json |
[production] |
23:01 |
<ladsgroup@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1118.eqiad.wmnet with reason: Maintenance |
[production] |
23:01 |
<ladsgroup@cumin1001> |
START - Cookbook sre.hosts.downtime for 6:00:00 on db1118.eqiad.wmnet with reason: Maintenance |
[production] |
23:01 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1107 (T322618)', diff saved to https://phabricator.wikimedia.org/P41955 and previous config saved to /var/cache/conftool/dbconfig/20221130-230132-ladsgroup.json |
[production] |
22:57 |
<tgr> |
UTC late deploys done |
[production] |
22:56 |
<tgr@deploy1002> |
Finished scap: Backport for [[gerrit:862352|Use the right load balancer for UserImpactStore]] (duration: 07m 15s) |
[production] |
22:56 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to https://phabricator.wikimedia.org/P41954 and previous config saved to /var/cache/conftool/dbconfig/20221130-225623-ladsgroup.json |
[production] |
22:50 |
<tgr@deploy1002> |
tgr and tgr: Backport for [[gerrit:862352|Use the right load balancer for UserImpactStore]] synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet |
[production] |
22:50 |
<brett@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5025.eqsin.wmnet with reason: host reimage |
[production] |
22:49 |
<tgr@deploy1002> |
Started scap: Backport for [[gerrit:862352|Use the right load balancer for UserImpactStore]] |
[production] |
22:46 |
<brett@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on cp5025.eqsin.wmnet with reason: host reimage |
[production] |
22:46 |
<tgr@deploy1002> |
Finished scap: Backport for [[gerrit:862351|Use the right load balancer for UserImpactStore]] (duration: 05m 59s) |
[production] |
22:46 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1107', diff saved to https://phabricator.wikimedia.org/P41953 and previous config saved to /var/cache/conftool/dbconfig/20221130-224626-ladsgroup.json |
[production] |
22:42 |
<pt1979@cumin2002> |
END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['sretest2001'] |
[production] |
22:42 |
<pt1979@cumin2002> |
START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['sretest2001'] |
[production] |
22:41 |
<tgr@deploy1002> |
tgr and tgr: Backport for [[gerrit:862351|Use the right load balancer for UserImpactStore]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet |
[production] |
22:41 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db2145 (T322618)', diff saved to https://phabricator.wikimedia.org/P41952 and previous config saved to /var/cache/conftool/dbconfig/20221130-224117-ladsgroup.json |
[production] |
22:40 |
<tgr@deploy1002> |
Started scap: Backport for [[gerrit:862351|Use the right load balancer for UserImpactStore]] |
[production] |
22:39 |
<tgr@deploy1002> |
Finished scap: Backport for [[gerrit:862346|NewImpact: wrap thanks count in a link to Thanks Log (T324087)]] (duration: 06m 42s) |
[production] |
22:39 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Depooling db2145 (T322618)', diff saved to https://phabricator.wikimedia.org/P41951 and previous config saved to /var/cache/conftool/dbconfig/20221130-223907-ladsgroup.json |
[production] |
22:39 |
<ladsgroup@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2145.codfw.wmnet with reason: Maintenance |
[production] |
22:38 |
<ladsgroup@cumin1001> |
START - Cookbook sre.hosts.downtime for 6:00:00 on db2145.codfw.wmnet with reason: Maintenance |
[production] |
22:38 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db2130 (T322618)', diff saved to https://phabricator.wikimedia.org/P41950 and previous config saved to /var/cache/conftool/dbconfig/20221130-223845-ladsgroup.json |
[production] |
22:37 |
<pt1979@cumin2002> |
END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['sretest2001'] |
[production] |
22:37 |
<pt1979@cumin2002> |
START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['sretest2001'] |
[production] |
22:36 |
<pt1979@cumin2002> |
END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['sretest2001'] |
[production] |
22:34 |
<tgr@deploy1002> |
tgr and kharlan: Backport for [[gerrit:862346|NewImpact: wrap thanks count in a link to Thanks Log (T324087)]] synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet |
[production] |
22:33 |
<tgr@deploy1002> |
Started scap: Backport for [[gerrit:862346|NewImpact: wrap thanks count in a link to Thanks Log (T324087)]] |
[production] |
22:31 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1107', diff saved to https://phabricator.wikimedia.org/P41949 and previous config saved to /var/cache/conftool/dbconfig/20221130-223119-ladsgroup.json |
[production] |
22:29 |
<pt1979@cumin2002> |
START - Cookbook sre.hosts.provision for host db1206.mgmt.eqiad.wmnet with reboot policy FORCED |
[production] |
22:28 |
<pt1979@cumin2002> |
END (PASS) - Cookbook sre.dns.netbox (exit_code=0) |
[production] |
22:28 |
<pt1979@cumin2002> |
END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS for db1206 - pt1979@cumin2002" |
[production] |
22:27 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db2175 (T323907)', diff saved to https://phabricator.wikimedia.org/P41948 and previous config saved to /var/cache/conftool/dbconfig/20221130-222706-ladsgroup.json |
[production] |
22:26 |
<pt1979@cumin2002> |
START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add DNS for db1206 - pt1979@cumin2002" |
[production] |
22:24 |
<pt1979@cumin2002> |
START - Cookbook sre.dns.netbox |
[production] |
22:23 |
<ebernhardson@deploy1002> |
Finished deploy [wikimedia/discovery/analytics@3287124]: set mjolnir max_active_runs to 1 (duration: 02m 28s) |
[production] |
22:23 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db2130', diff saved to https://phabricator.wikimedia.org/P41947 and previous config saved to /var/cache/conftool/dbconfig/20221130-222339-ladsgroup.json |
[production] |
22:21 |
<ebernhardson@deploy1002> |
Started deploy [wikimedia/discovery/analytics@3287124]: set mjolnir max_active_runs to 1 |
[production] |
22:19 |
<robh@cumin2002> |
END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['lvs5004'] |
[production] |
22:18 |
<brett@cumin1001> |
START - Cookbook sre.hosts.reimage for host cp5025.eqsin.wmnet with OS buster |
[production] |
22:16 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1107 (T322618)', diff saved to https://phabricator.wikimedia.org/P41946 and previous config saved to /var/cache/conftool/dbconfig/20221130-221613-ladsgroup.json |
[production] |
22:15 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Depooling db1107 (T322618)', diff saved to https://phabricator.wikimedia.org/P41945 and previous config saved to /var/cache/conftool/dbconfig/20221130-221505-ladsgroup.json |
[production] |
22:14 |
<ladsgroup@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1107.eqiad.wmnet with reason: Maintenance |
[production] |