2025-07-30
|
07:55 |
<mlitn@deploy1003> |
Started scap sync-world: Backport for [[gerrit:1171239|Add new MediaSearch config/coefficients (T385286)]] |
[production] |
07:53 |
<jelto@cumin1003> |
END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) 'https://gitlab.wikimedia.org/ https://gitlab-replica-b.wikimedia.org/' on all recursors |
[production] |
07:53 |
<jelto@cumin1003> |
START - Cookbook sre.dns.wipe-cache 'https://gitlab.wikimedia.org/ https://gitlab-replica-b.wikimedia.org/' on all recursors |
[production] |
07:51 |
<jelto@cumin1003> |
END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) 'https://gitlab.wikimedia.org/ https://gitlab-replica-b.wikimedia.org/' on all recursors |
[production] |
07:51 |
<jelto@cumin1003> |
START - Cookbook sre.dns.wipe-cache 'https://gitlab.wikimedia.org/ https://gitlab-replica-b.wikimedia.org/' on all recursors |
[production] |
07:50 |
<jelto@cumin1003> |
END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) 'https://gitlab.wikimedia.org/ https://gitlab-replica-b.wikimedia.org/' on all recursors |
[production] |
07:50 |
<jelto@cumin1003> |
START - Cookbook sre.dns.wipe-cache 'https://gitlab.wikimedia.org/ https://gitlab-replica-b.wikimedia.org/' on all recursors |
[production] |
07:50 |
<jelto@dns1004> |
END - running authdns-update |
[production] |
07:50 |
<elukey@cumin1003> |
START - Cookbook sre.hosts.provision for host sretest2010.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART |
[production] |
07:49 |
<jelto@dns1004> |
START - running authdns-update |
[production] |
07:42 |
<fceratto@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db1160 (T399728)', diff saved to https://phabricator.wikimedia.org/P80272 and previous config saved to /var/cache/conftool/dbconfig/20250730-074213-fceratto.json |
[production] |
07:35 |
<fceratto@cumin1002> |
dbctl commit (dc=all): 'Depooling db1160 (T399728)', diff saved to https://phabricator.wikimedia.org/P80271 and previous config saved to /var/cache/conftool/dbconfig/20250730-073517-fceratto.json |
[production] |
07:35 |
<fceratto@cumin1002> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1160.eqiad.wmnet with reason: Maintenance |
[production] |
07:31 |
<fceratto@cumin1002> |
DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1150.eqiad.wmnet with reason: Maintenance |
[production] |
06:37 |
<jelto@cumin1003> |
START - Cookbook sre.gitlab.failover Failover of gitlab from gitlab2002.wikimedia.org to gitlab1004.wikimedia.org |
[production] |
01:11 |
<mwpresync@deploy1003> |
Finished scap build-images: Publishing wmf/next image (duration: 10m 52s) |
[production] |
01:00 |
<mwpresync@deploy1003> |
Started scap build-images: Publishing wmf/next image |
[production] |
2025-07-29
|
23:10 |
<cwhite@cumin2002> |
END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host logstash2035.codfw.wmnet with OS bookworm |
[production] |
22:48 |
<cwhite@cumin2002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on logstash2035.codfw.wmnet with reason: host reimage |
[production] |
22:42 |
<cwhite@cumin2002> |
START - Cookbook sre.hosts.downtime for 2:00:00 on logstash2035.codfw.wmnet with reason: host reimage |
[production] |
22:24 |
<ryankemper@cumin2002> |
START - Cookbook sre.wdqs.data-reload reloading wikidata_main on wdqs1022.eqiad.wmnet from DumpsSource.HDFS (hdfs:///wmf/data/discovery/wikidata/munged_n3_dump/wikidata/main/20250714/ using stat1009.eqiad.wmnet) |
[production] |
22:23 |
<cwhite@cumin2002> |
END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host logstash2035 |
[production] |
22:23 |
<cwhite@cumin2002> |
END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host logstash2035 |
[production] |
22:19 |
<bking@cumin2002> |
END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cirrussearch2091.codfw.wmnet with OS bullseye |
[production] |
22:15 |
<kemayo@deploy1003> |
Finished scap sync-world: Backport for [[gerrit:1172397|Enable DiscussionTools thanks on existing "report incident" wikis (T366095)]] (duration: 12m 28s) |
[production] |
22:15 |
<cwhite@cumin2002> |
START - Cookbook sre.network.configure-switch-interfaces for host logstash2035 |
[production] |
22:15 |
<cwhite@cumin2002> |
END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) logstash2035.codfw.wmnet 28.32.192.10.in-addr.arpa 8.2.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors |
[production] |
22:15 |
<cwhite@cumin2002> |
START - Cookbook sre.dns.wipe-cache logstash2035.codfw.wmnet 28.32.192.10.in-addr.arpa 8.2.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors |
[production] |
22:15 |
<cwhite@cumin2002> |
END (PASS) - Cookbook sre.dns.netbox (exit_code=0) |
[production] |
22:15 |
<cwhite@cumin2002> |
END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host logstash2035 - cwhite@cumin2002" |
[production] |
22:15 |
<cwhite@cumin2002> |
START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host logstash2035 - cwhite@cumin2002" |
[production] |
22:14 |
<ryankemper@cumin2002> |
END (FAIL) - Cookbook sre.wdqs.data-reload (exit_code=99) reloading wikidata_main on wdqs1022.eqiad.wmnet from DumpsSource.HDFS (hdfs:///wmf/data/discovery/wikidata/munged_n3_dump/wikidata/main/20250714/ using stat1009.eqiad.wmnet) |
[production] |
22:10 |
<cwhite@cumin2002> |
START - Cookbook sre.dns.netbox |
[production] |
22:10 |
<cwhite@cumin2002> |
START - Cookbook sre.hosts.move-vlan for host logstash2035 |
[production] |
22:10 |
<kemayo@deploy1003> |
kemayo: Continuing with sync |
[production] |
22:09 |
<cwhite@cumin2002> |
START - Cookbook sre.hosts.reimage for host logstash2035.codfw.wmnet with OS bookworm |
[production] |
22:05 |
<kemayo@deploy1003> |
kemayo: Backport for [[gerrit:1172397|Enable DiscussionTools thanks on existing "report incident" wikis (T366095)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. |
[production] |
22:03 |
<kemayo@deploy1003> |
Started scap sync-world: Backport for [[gerrit:1172397|Enable DiscussionTools thanks on existing "report incident" wikis (T366095)]] |
[production] |
21:58 |
<bking@cumin2002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cirrussearch2091.codfw.wmnet with reason: host reimage |
[production] |
21:51 |
<bking@cumin2002> |
START - Cookbook sre.hosts.downtime for 2:00:00 on cirrussearch2091.codfw.wmnet with reason: host reimage |
[production] |
21:34 |
<bking@cumin2002> |
START - Cookbook sre.hosts.reimage for host cirrussearch2091.codfw.wmnet with OS bullseye |
[production] |
21:19 |
<ryankemper@cumin2002> |
START - Cookbook sre.wdqs.data-reload reloading wikidata_main on wdqs1022.eqiad.wmnet from DumpsSource.HDFS (hdfs:///wmf/data/discovery/wikidata/munged_n3_dump/wikidata/main/20250714/ using stat1009.eqiad.wmnet) |
[production] |
21:16 |
<ryankemper@cumin1002> |
END (FAIL) - Cookbook sre.wdqs.data-reload (exit_code=99) reloading wikidata_main on wdqs1022.eqiad.wmnet from DumpsSource.HDFS (hdfs:///wmf/data/discovery/wikidata/rdf_subgraphs/snapshot=20250714/wiki=wikidata/scope=wikidata_main/ using stat1009.eqiad.wmnet) |
[production] |
21:09 |
<ryankemper@cumin1002> |
START - Cookbook sre.wdqs.data-reload reloading wikidata_main on wdqs1022.eqiad.wmnet from DumpsSource.HDFS (hdfs:///wmf/data/discovery/wikidata/rdf_subgraphs/snapshot=20250714/wiki=wikidata/scope=wikidata_main/ using stat1009.eqiad.wmnet) |
[production] |
21:03 |
<mwpresync@deploy1003> |
Finished scap build-images: Publishing wmf/next image (duration: 00m 57s) |
[production] |
21:02 |
<mwpresync@deploy1003> |
Started scap build-images: Publishing wmf/next image |
[production] |
20:42 |
<cdanis@deploy1003> |
Finished scap sync-world: Backport for [[gerrit:1174016|probenet: Report CDN host handling each measure request (T398596)]] (duration: 10m 27s) |
[production] |
20:42 |
<vriley@cumin1002> |
START - Cookbook sre.hosts.reimage for host cloudcephosd1042.eqiad.wmnet with OS bullseye |
[production] |
20:41 |
<vriley@cumin1002> |
END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcephosd1042.eqiad.wmnet with OS bullseye |
[production] |
20:37 |
<cdanis@deploy1003> |
cdanis: Continuing with sync |
[production] |