2025-04-17
20:06 <bking@cumin2002> END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cirrussearch2058 [production]
20:06 <bking@cumin2002> START - Cookbook sre.network.configure-switch-interfaces for host cirrussearch2058 [production]
20:06 <bking@cumin2002> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
20:06 <bking@cumin2002> END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming elastic2058 to cirrussearch2058 - bking@cumin2002" [production]
20:05 <bking@cumin2002> START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming elastic2058 to cirrussearch2058 - bking@cumin2002" [production]
20:02 <vriley@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-worker1181.eqiad.wmnet with reason: host reimage [production]
20:00 <bking@cumin2002> START - Cookbook sre.dns.netbox [production]
20:00 <bking@cumin2002> START - Cookbook sre.hosts.rename from elastic2058 to cirrussearch2058 [production]
20:00 <fceratto@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1248 (T391056)', diff saved to https://phabricator.wikimedia.org/P75230 and previous config saved to /var/cache/conftool/dbconfig/20250417-200008-fceratto.json [production]
19:59 <bking@cumin2002> START - Cookbook sre.elasticsearch.rolling-operation Operation.REIMAGE (3 nodes at a time) for ElasticSearch cluster search_codfw: reimage row B - bking@cumin2002 - T388610 [production]
19:59 <bking@cumin2002> END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.REIMAGE (3 nodes at a time) for ElasticSearch cluster search_codfw: reimage row B - bking@cumin2002 - T388610 [production]
19:58 <bking@cumin2002> START - Cookbook sre.elasticsearch.rolling-operation Operation.REIMAGE (3 nodes at a time) for ElasticSearch cluster search_codfw: reimage row B - bking@cumin2002 - T388610 [production]
19:58 <vriley@cumin1002> START - Cookbook sre.hosts.downtime for 2:00:00 on an-worker1181.eqiad.wmnet with reason: host reimage [production]
19:55 <fceratto@cumin1002> dbctl commit (dc=all): 'Depooling db1248 (T391056)', diff saved to https://phabricator.wikimedia.org/P75229 and previous config saved to /var/cache/conftool/dbconfig/20250417-195506-fceratto.json [production]
19:54 <fceratto@cumin1002> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1248.eqiad.wmnet with reason: Maintenance [production]
19:54 <fceratto@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1247 (T391056)', diff saved to https://phabricator.wikimedia.org/P75228 and previous config saved to /var/cache/conftool/dbconfig/20250417-195442-fceratto.json [production]
19:50 <vriley@cumin1002> START - Cookbook sre.hosts.provision for host an-worker1178.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL [production]
19:50 <vriley@cumin1002> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host an-worker1178.eqiad.wmnet with OS bullseye [production]
19:44 <vriley@cumin1002> START - Cookbook sre.hosts.reimage for host an-worker1181.eqiad.wmnet with OS bullseye [production]
19:43 <vriley@cumin1002> END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host an-worker1181.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL [production]
19:42 <vriley@cumin1002> START - Cookbook sre.hosts.provision for host an-worker1181.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL [production]
19:39 <fceratto@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1247', diff saved to https://phabricator.wikimedia.org/P75226 and previous config saved to /var/cache/conftool/dbconfig/20250417-193935-fceratto.json [production]
19:36 <vriley@cumin1002> START - Cookbook sre.hosts.reimage for host an-worker1178.eqiad.wmnet with OS bullseye [production]
19:35 <vriley@cumin1002> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host an-worker1178.eqiad.wmnet with OS bullseye [production]
19:24 <fceratto@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1247', diff saved to https://phabricator.wikimedia.org/P75225 and previous config saved to /var/cache/conftool/dbconfig/20250417-192430-fceratto.json [production]
19:22 <vriley@cumin1002> START - Cookbook sre.hosts.reimage for host an-worker1178.eqiad.wmnet with OS bullseye [production]
19:21 <vriley@cumin1002> END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host an-worker1178.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL [production]
19:14 <vriley@cumin1002> START - Cookbook sre.hosts.provision for host an-worker1178.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL [production]
19:09 <fceratto@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1247 (T391056)', diff saved to https://phabricator.wikimedia.org/P75223 and previous config saved to /var/cache/conftool/dbconfig/20250417-190923-fceratto.json [production]
19:03 <fceratto@cumin1002> dbctl commit (dc=all): 'Depooling db1247 (T391056)', diff saved to https://phabricator.wikimedia.org/P75222 and previous config saved to /var/cache/conftool/dbconfig/20250417-190331-fceratto.json [production]
19:03 <fceratto@cumin1002> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1247.eqiad.wmnet with reason: Maintenance [production]
18:59 <fceratto@cumin1002> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1245.eqiad.wmnet with reason: Maintenance [production]
18:59 <fceratto@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1243 (T391056)', diff saved to https://phabricator.wikimedia.org/P75221 and previous config saved to /var/cache/conftool/dbconfig/20250417-185930-fceratto.json [production]
18:44 <fceratto@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1243', diff saved to https://phabricator.wikimedia.org/P75219 and previous config saved to /var/cache/conftool/dbconfig/20250417-184423-fceratto.json [production]
18:29 <fceratto@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1243', diff saved to https://phabricator.wikimedia.org/P75218 and previous config saved to /var/cache/conftool/dbconfig/20250417-182916-fceratto.json [production]
18:14 <fceratto@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1243 (T391056)', diff saved to https://phabricator.wikimedia.org/P75217 and previous config saved to /var/cache/conftool/dbconfig/20250417-181408-fceratto.json [production]
18:13 <dduvall@deploy1003> rebuilt and synchronized wikiversions files: group2 to 1.44.0-wmf.25 refs T386220 [production]
17:56 <fceratto@cumin1002> dbctl commit (dc=all): 'Depooling db1243 (T391056)', diff saved to https://phabricator.wikimedia.org/P75216 and previous config saved to /var/cache/conftool/dbconfig/20250417-175614-fceratto.json [production]
17:56 <fceratto@cumin1002> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1243.eqiad.wmnet with reason: Maintenance [production]
17:55 <fceratto@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1242 (T391056)', diff saved to https://phabricator.wikimedia.org/P75215 and previous config saved to /var/cache/conftool/dbconfig/20250417-175552-fceratto.json [production]
17:40 <fceratto@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1242', diff saved to https://phabricator.wikimedia.org/P75214 and previous config saved to /var/cache/conftool/dbconfig/20250417-174046-fceratto.json [production]
17:25 <fceratto@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1242', diff saved to https://phabricator.wikimedia.org/P75213 and previous config saved to /var/cache/conftool/dbconfig/20250417-172539-fceratto.json [production]
17:20 <mutante> idp-test2005 - 100% disk space used - alerting since over 6 days (is there a point in alerts for test hosts?) - apt-get clean .. brought it back to 94% [production]
17:12 <bd808@deploy1003> helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply [production]
17:11 <bd808@deploy1003> helmfile [eqiad] START helmfile.d/services/developer-portal: apply [production]
17:10 <fceratto@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1242 (T391056)', diff saved to https://phabricator.wikimedia.org/P75212 and previous config saved to /var/cache/conftool/dbconfig/20250417-171032-fceratto.json [production]
17:09 <bd808@deploy1003> helmfile [codfw] DONE helmfile.d/services/developer-portal: apply [production]
17:09 <bd808@deploy1003> helmfile [codfw] START helmfile.d/services/developer-portal: apply [production]
17:09 <bd808@deploy1003> helmfile [staging] DONE helmfile.d/services/developer-portal: apply [production]
17:08 <bd808@deploy1003> helmfile [staging] START helmfile.d/services/developer-portal: apply [production]