2023-05-09
§
|
13:49 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1189 (T335845)', diff saved to https://phabricator.wikimedia.org/P48011 and previous config saved to /var/cache/conftool/dbconfig/20230509-134929-ladsgroup.json |
[production] |
13:49 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Depool db2180 T336031', diff saved to https://phabricator.wikimedia.org/P48010 and previous config saved to /var/cache/conftool/dbconfig/20230509-134921-root.json |
[production] |
13:44 |
<moritzm> |
rearmed keyholder on netmon* post reboot |
[production] |
13:43 |
<taavi@deploy1002> |
taavi: Backport for [[gerrit:910768|Add $wmgUseRealMe (T324535)]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet |
[production] |
13:42 |
<sukhe> |
sudo cumin -b1 -s1200 'A:cp and A:esams' 'varnish-frontend-restart': T253093 |
[production] |
13:42 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to https://phabricator.wikimedia.org/P48009 and previous config saved to /var/cache/conftool/dbconfig/20230509-134244-ladsgroup.json |
[production] |
13:42 |
<taavi@deploy1002> |
Started scap: Backport for [[gerrit:910768|Add $wmgUseRealMe (T324535)]] |
[production] |
13:38 |
<taavi@deploy1002> |
Finished scap: Backport for [[gerrit:910767|Add RealMe to extension-list (T324535)]] (duration: 35m 47s) |
[production] |
13:34 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P48008 and previous config saved to /var/cache/conftool/dbconfig/20230509-133416-ladsgroup.json |
[production] |
13:28 |
<btullis@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on an-worker1088.eqiad.wmnet with reason: Replacing RAID controller battery |
[production] |
13:28 |
<btullis@cumin1001> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-test-client1001.eqiad.wmnet |
[production] |
13:28 |
<btullis@cumin1001> |
START - Cookbook sre.hosts.downtime for 4:00:00 on an-worker1088.eqiad.wmnet with reason: Replacing RAID controller battery |
[production] |
13:27 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to https://phabricator.wikimedia.org/P48007 and previous config saved to /var/cache/conftool/dbconfig/20230509-132737-ladsgroup.json |
[production] |
13:27 |
<moritzm> |
updated bookworm d-i image to 2023-05-09 daily build T330495 |
[production] |
13:23 |
<btullis@cumin1001> |
START - Cookbook sre.hosts.reboot-single for host an-test-client1001.eqiad.wmnet |
[production] |
13:23 |
<taavi@deploy1002> |
taavi: Backport for [[gerrit:910767|Add RealMe to extension-list (T324535)]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet |
[production] |
13:23 |
<btullis@cumin1001> |
END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for an-worker1088.eqiad.wmnet |
[production] |
13:23 |
<btullis@cumin1001> |
START - Cookbook sre.hosts.remove-downtime for an-worker1088.eqiad.wmnet |
[production] |
13:19 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P48006 and previous config saved to /var/cache/conftool/dbconfig/20230509-131910-ladsgroup.json |
[production] |
13:12 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db2146 (T335845)', diff saved to https://phabricator.wikimedia.org/P48005 and previous config saved to /var/cache/conftool/dbconfig/20230509-131231-ladsgroup.json |
[production] |
13:05 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Depooling db2146 (T335845)', diff saved to https://phabricator.wikimedia.org/P48004 and previous config saved to /var/cache/conftool/dbconfig/20230509-130524-ladsgroup.json |
[production] |
13:05 |
<ladsgroup@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2146.codfw.wmnet with reason: Maintenance |
[production] |
13:05 |
<ladsgroup@cumin1001> |
START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2146.codfw.wmnet with reason: Maintenance |
[production] |
13:05 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db2145 (T335845)', diff saved to https://phabricator.wikimedia.org/P48003 and previous config saved to /var/cache/conftool/dbconfig/20230509-130459-ladsgroup.json |
[production] |
13:04 |
<jmm@cumin2002> |
END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "sync after adding ldap-rw servers - jmm@cumin2002" |
[production] |
13:04 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1189 (T335845)', diff saved to https://phabricator.wikimedia.org/P48002 and previous config saved to /var/cache/conftool/dbconfig/20230509-130404-ladsgroup.json |
[production] |
13:02 |
<taavi@deploy1002> |
Started scap: Backport for [[gerrit:910767|Add RealMe to extension-list (T324535)]] |
[production] |
13:01 |
<jmm@cumin2002> |
START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "sync after adding ldap-rw servers - jmm@cumin2002" |
[production] |
12:58 |
<btullis@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on an-worker1088.eqiad.wmnet with reason: Upgrading RAID controller firmware |
[production] |
12:58 |
<btullis@cumin1001> |
START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on an-worker1088.eqiad.wmnet with reason: Upgrading RAID controller firmware |
[production] |
12:58 |
<jmm@cumin2002> |
END (PASS) - Cookbook sre.ganeti.reimage (exit_code=0) for host ldap-rw2001.wikimedia.org with OS bullseye |
[production] |
12:56 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Depooling db1189 (T335845)', diff saved to https://phabricator.wikimedia.org/P48001 and previous config saved to /var/cache/conftool/dbconfig/20230509-125644-ladsgroup.json |
[production] |
12:56 |
<ladsgroup@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1189.eqiad.wmnet with reason: Maintenance |
[production] |
12:56 |
<ladsgroup@cumin1001> |
START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1189.eqiad.wmnet with reason: Maintenance |
[production] |
12:56 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1175 (T335845)', diff saved to https://phabricator.wikimedia.org/P48000 and previous config saved to /var/cache/conftool/dbconfig/20230509-125620-ladsgroup.json |
[production] |
12:49 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to https://phabricator.wikimedia.org/P47999 and previous config saved to /var/cache/conftool/dbconfig/20230509-124953-ladsgroup.json |
[production] |
12:45 |
<jmm@cumin2002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ldap-rw2001.wikimedia.org with reason: host reimage |
[production] |
12:41 |
<jmm@cumin2002> |
START - Cookbook sre.hosts.downtime for 2:00:00 on ldap-rw2001.wikimedia.org with reason: host reimage |
[production] |
12:41 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P47997 and previous config saved to /var/cache/conftool/dbconfig/20230509-124114-ladsgroup.json |
[production] |
12:34 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to https://phabricator.wikimedia.org/P47996 and previous config saved to /var/cache/conftool/dbconfig/20230509-123447-ladsgroup.json |
[production] |
12:31 |
<btullis@cumin1001> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host eventlog1003.eqiad.wmnet |
[production] |
12:29 |
<jmm@cumin2002> |
START - Cookbook sre.ganeti.reimage for host ldap-rw2001.wikimedia.org with OS bullseye |
[production] |
12:27 |
<btullis@cumin1001> |
START - Cookbook sre.hosts.reboot-single for host eventlog1003.eqiad.wmnet |
[production] |
12:26 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P47995 and previous config saved to /var/cache/conftool/dbconfig/20230509-122608-ladsgroup.json |
[production] |
12:19 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db2145 (T335845)', diff saved to https://phabricator.wikimedia.org/P47994 and previous config saved to /var/cache/conftool/dbconfig/20230509-121941-ladsgroup.json |
[production] |
12:14 |
<eoghan@cumin1001> |
END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts aphlict1001.eqiad.wmnet |
[production] |
12:14 |
<eoghan@cumin1001> |
END (PASS) - Cookbook sre.dns.netbox (exit_code=0) |
[production] |
12:14 |
<eoghan@cumin1001> |
END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: aphlict1001.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - eoghan@cumin1001" |
[production] |
12:11 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Depooling db2145 (T335845)', diff saved to https://phabricator.wikimedia.org/P47992 and previous config saved to /var/cache/conftool/dbconfig/20230509-121119-ladsgroup.json |
[production] |
12:11 |
<ladsgroup@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2145.codfw.wmnet with reason: Maintenance |
[production] |