2024-06-05
|
20:03 |
<urbanecm@deploy1002> |
Started scap: Backport for [[gerrit:1038740|[CheckUser] Stop writing old for event tables migration on group0 (T360685)]], [[gerrit:1038882|Growth: Use `growthexperiments` DB list for enabling GrowthExperiments (T364892)]], [[gerrit:1035473|[Beta] Enable CommunityConfiguration extension in all wikis (T364892)]] |
[production] |
20:02 |
<jhathaway@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mx-in1001.wikimedia.org with reason: host reimage |
[production] |
19:57 |
<jhathaway@cumin1002> |
START - Cookbook sre.hosts.downtime for 2:00:00 on mx-in1001.wikimedia.org with reason: host reimage |
[production] |
19:47 |
<jhathaway@cumin1002> |
START - Cookbook sre.hosts.reimage for host mx-in1001.wikimedia.org with OS bookworm |
[production] |
19:45 |
<jhathaway@cumin1002> |
END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM mx-in1001.wikimedia.org - jhathaway@cumin1002" |
[production] |
19:44 |
<jhathaway@cumin1002> |
START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM mx-in1001.wikimedia.org - jhathaway@cumin1002" |
[production] |
19:43 |
<jhathaway@cumin1002> |
END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) mx-in1001.wikimedia.org on all recursors |
[production] |
19:43 |
<jhathaway@cumin1002> |
START - Cookbook sre.dns.wipe-cache mx-in1001.wikimedia.org on all recursors |
[production] |
19:43 |
<jhathaway@cumin1002> |
END (PASS) - Cookbook sre.dns.netbox (exit_code=0) |
[production] |
19:43 |
<jhathaway@cumin1002> |
END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM mx-in1001.wikimedia.org - jhathaway@cumin1002" |
[production] |
19:38 |
<jhathaway@cumin1002> |
START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM mx-in1001.wikimedia.org - jhathaway@cumin1002" |
[production] |
19:36 |
<jhathaway@cumin1002> |
START - Cookbook sre.dns.netbox |
[production] |
19:36 |
<jhathaway@cumin1002> |
START - Cookbook sre.ganeti.makevm for new host mx-in1001.wikimedia.org |
[production] |
19:27 |
<ryankemper@cumin2002> |
END (FAIL) - Cookbook sre.wdqs.data-reload (exit_code=99) reloading wikidata_full on wdqs2023.codfw.wmnet from DumpsSource.HDFS (hdfs:///wmf/data/discovery/wikidata/munged_n3_dump/wikidata/full/20240527/ using stat1009.eqiad.wmnet) |
[production] |
19:09 |
<swfrench@deploy1002> |
helmfile [codfw] DONE helmfile.d/services/data-gateway: apply |
[production] |
18:58 |
<swfrench@deploy1002> |
helmfile [codfw] START helmfile.d/services/data-gateway: apply |
[production] |
18:53 |
<dduvall@deploy1002> |
rebuilt and synchronized wikiversions files: group1 wikis to 1.43.0-wmf.8 refs T361402 |
[production] |
18:53 |
<ryankemper@cumin2002> |
START - Cookbook sre.wdqs.data-reload reloading wikidata_full on wdqs2023.codfw.wmnet from DumpsSource.HDFS (hdfs:///wmf/data/discovery/wikidata/munged_n3_dump/wikidata/full/20240527/ using stat1009.eqiad.wmnet) |
[production] |
18:42 |
<ladsgroup@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2203 (T352010)', diff saved to https://phabricator.wikimedia.org/P64132 and previous config saved to /var/cache/conftool/dbconfig/20240605-184250-ladsgroup.json |
[production] |
18:27 |
<ladsgroup@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2203', diff saved to https://phabricator.wikimedia.org/P64131 and previous config saved to /var/cache/conftool/dbconfig/20240605-182742-ladsgroup.json |
[production] |
18:13 |
<swfrench@deploy1002> |
helmfile [staging] DONE helmfile.d/services/data-gateway: apply |
[production] |
18:12 |
<ladsgroup@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2203', diff saved to https://phabricator.wikimedia.org/P64130 and previous config saved to /var/cache/conftool/dbconfig/20240605-181234-ladsgroup.json |
[production] |
18:12 |
<swfrench@deploy1002> |
helmfile [staging] START helmfile.d/services/data-gateway: apply |
[production] |
18:11 |
<aokoth@cumin1002> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host vrts1001.eqiad.wmnet |
[production] |
18:07 |
<aokoth@cumin1002> |
START - Cookbook sre.hosts.reboot-single for host vrts1001.eqiad.wmnet |
[production] |
18:06 |
<ryankemper@cumin2002> |
END (FAIL) - Cookbook sre.wdqs.data-reload (exit_code=99) reloading wikidata_full on wdqs2023.codfw.wmnet from DumpsSource.HDFS (hdfs:///wmf/data/discovery/wikidata/munged_n3_dump/wikidata/full/20240527/ using stat1009.eqiad.wmnet) |
[production] |
17:57 |
<ladsgroup@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2203 (T352010)', diff saved to https://phabricator.wikimedia.org/P64129 and previous config saved to /var/cache/conftool/dbconfig/20240605-175725-ladsgroup.json |
[production] |
17:55 |
<ladsgroup@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2207 (T352010)', diff saved to https://phabricator.wikimedia.org/P64128 and previous config saved to /var/cache/conftool/dbconfig/20240605-175503-ladsgroup.json |
[production] |
17:50 |
<kamila@cumin1002> |
START - Cookbook sre.hosts.dhcp for host wikikube-ctrl1001.eqiad.wmnet |
[production] |
17:47 |
<marostegui@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2199.codfw.wmnet with reason: Maintenance |
[production] |
17:47 |
<marostegui@cumin1002> |
START - Cookbook sre.hosts.downtime for 6:00:00 on db2199.codfw.wmnet with reason: Maintenance |
[production] |
17:47 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2179 (T364299)', diff saved to https://phabricator.wikimedia.org/P64127 and previous config saved to /var/cache/conftool/dbconfig/20240605-174724-marostegui.json |
[production] |
17:42 |
<ladsgroup@deploy1002> |
Finished scap: Backport for [[gerrit:1039256|Stop writing to pagelinks old columns in enwiki (T352010)]] (duration: 12m 19s) |
[production] |
17:39 |
<ladsgroup@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2207', diff saved to https://phabricator.wikimedia.org/P64126 and previous config saved to /var/cache/conftool/dbconfig/20240605-173954-ladsgroup.json |
[production] |
17:33 |
<ladsgroup@deploy1002> |
ladsgroup: Continuing with sync |
[production] |
17:32 |
<ladsgroup@deploy1002> |
ladsgroup: Backport for [[gerrit:1039256|Stop writing to pagelinks old columns in enwiki (T352010)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug) |
[production] |
17:32 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P64125 and previous config saved to /var/cache/conftool/dbconfig/20240605-173216-marostegui.json |
[production] |
17:31 |
<ryankemper@cumin2002> |
START - Cookbook sre.wdqs.data-reload reloading wikidata_full on wdqs2023.codfw.wmnet from DumpsSource.HDFS (hdfs:///wmf/data/discovery/wikidata/munged_n3_dump/wikidata/full/20240527/ using stat1009.eqiad.wmnet) |
[production] |
17:29 |
<ladsgroup@deploy1002> |
Started scap: Backport for [[gerrit:1039256|Stop writing to pagelinks old columns in enwiki (T352010)]] |
[production] |
17:27 |
<kamila@cumin1002> |
END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['wikikube-ctrl1001'] |
[production] |
17:24 |
<ryankemper@cumin2002> |
END (FAIL) - Cookbook sre.wdqs.data-reload (exit_code=99) reloading wikidata_full on wdqs2023.codfw.wmnet from DumpsSource.HDFS (hdfs:///wmf/data/discovery/wikidata/munged_n3_dump/wikidata/full/20240527/ using stat1009.eqiad.wmnet) |
[production] |
17:24 |
<ladsgroup@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2207', diff saved to https://phabricator.wikimedia.org/P64124 and previous config saved to /var/cache/conftool/dbconfig/20240605-172446-ladsgroup.json |
[production] |
17:17 |
<marostegui@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2179', diff saved to https://phabricator.wikimedia.org/P64123 and previous config saved to /var/cache/conftool/dbconfig/20240605-171708-marostegui.json |
[production] |
17:13 |
<kamila@cumin1002> |
START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['wikikube-ctrl1001'] |
[production] |
17:12 |
<kamila@cumin1002> |
END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wikikube-ctrl1001.eqiad.wmnet with OS bullseye |
[production] |
17:10 |
<jhathaway> |
phabricator email now egressing via mx-out{1001,2001}.wikimedia.org, which should solve the SPF warnings in your inbox |
[production] |
17:10 |
<dcaro@cumin1002> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcephosd1033.eqiad.wmnet |
[production] |
17:09 |
<ladsgroup@cumin1002> |
dbctl commit (dc=all): 'Repooling after maintenance db2207 (T352010)', diff saved to https://phabricator.wikimedia.org/P64122 and previous config saved to /var/cache/conftool/dbconfig/20240605-170938-ladsgroup.json |
[production] |
17:06 |
<dzahn@cumin1002> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10 days, 0:00:00 on stat1007.eqiad.wmnet with reason: decom T353785 |
[production] |
17:06 |
<dcaro@cumin1002> |
START - Cookbook sre.hosts.reboot-single for host cloudcephosd1033.eqiad.wmnet |
[production] |