2022-12-08
20:50 <pt1979@cumin2002> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host sretest2002.codfw.wmnet with OS bullseye [production]
20:35 <ryankemper@cumin1001> START - Cookbook sre.elasticsearch.rolling-operation Operation.UPGRADE (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic elasticsearch and plugin upgrade - ryankemper@cumin1001 - T322776 [production]
20:34 <ryankemper> [Cloudelastic] Cleaned up stale (not running but files not removed) elasticsearch 6 units which broke the previous rolling upgrade run on cloudelastic1005 [production]
20:31 <ryankemper@cumin1001> END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.UPGRADE (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic elasticsearch and plugin upgrade - ryankemper@cumin1001 - T322776 [production]
20:27 <bking@cumin2002> START - Cookbook sre.wdqs.data-reload [production]
20:27 <bking@cumin2002> END (ERROR) - Cookbook sre.wdqs.data-reload (exit_code=97) [production]
20:22 <ryankemper@cumin1001> START - Cookbook sre.elasticsearch.rolling-operation Operation.UPGRADE (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic elasticsearch and plugin upgrade - ryankemper@cumin1001 - T322776 [production]
20:21 <ryankemper@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on 6 hosts with reason: Plugin upgrade for T322776 [production]
20:21 <ryankemper@cumin1001> START - Cookbook sre.hosts.downtime for 3:00:00 on 6 hosts with reason: Plugin upgrade for T322776 [production]
20:17 <ryankemper> T323064 Merged https://gerrit.wikimedia.org/r/c/operations/grafana-grizzly/+/862178 and deployed new dashboard, visible here: https://grafana.wikimedia.org/d/slo-wdqs-tmpl/wdqs-slos-grizzly-template?orgId=1 [production]
20:12 <demon@deploy1002> rebuilt and synchronized wikiversions files: group2 wikis to 1.40.0-wmf.13 refs T320518 [production]
20:09 <ryankemper@cumin1001> START - Cookbook sre.elasticsearch.rolling-operation Operation.UPGRADE (1 nodes at a time) for ElasticSearch cluster relforge: relforge elasticsearch and plugin upgrade - ryankemper@cumin1001 - T322776 [production]
19:59 <ryankemper@cumin1001> END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.UPGRADE (1 nodes at a time) for ElasticSearch cluster relforge: relforge elasticsearch and plugin upgrade - ryankemper@cumin1001 - T322776 [production]
19:59 <ryankemper@cumin1001> START - Cookbook sre.elasticsearch.rolling-operation Operation.UPGRADE (1 nodes at a time) for ElasticSearch cluster relforge: relforge elasticsearch and plugin upgrade - ryankemper@cumin1001 - T322776 [production]
19:53 <pt1979@cumin2002> START - Cookbook sre.hosts.reimage for host sretest2002.codfw.wmnet with OS bullseye [production]
16:14 <eevans@cumin1001> END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cassandra-dev2001 [production]
16:14 <eevans@cumin1001> START - Cookbook sre.network.configure-switch-interfaces for host cassandra-dev2001 [production]
16:13 <eevans@cumin1001> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
16:13 <eevans@cumin1001> END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Rename restbase-dev2001 to cassandra-dev2001 - eevans@cumin1001" [production]
16:12 <eevans@cumin1001> START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Rename restbase-dev2001 to cassandra-dev2001 - eevans@cumin1001" [production]
16:10 <eevans@cumin1001> START - Cookbook sre.dns.netbox [production]
16:08 <eevans@cumin1001> END (ERROR) - Cookbook sre.dns.netbox (exit_code=97) [production]
16:08 <eevans@cumin1001> START - Cookbook sre.dns.netbox [production]
16:02 <mvernon@cumin2002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host thanos-be2002.codfw.wmnet with OS bullseye [production]
15:48 <cgoubert@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 365 days, 0:00:00 on contint1001.wikimedia.org with reason: awaiting decom [production]
15:48 <cgoubert@cumin1001> START - Cookbook sre.hosts.downtime for 365 days, 0:00:00 on contint1001.wikimedia.org with reason: awaiting decom [production]
15:45 <mvernon@cumin2002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on thanos-be2002.codfw.wmnet with reason: host reimage [production]
15:42 <mvernon@cumin2002> START - Cookbook sre.hosts.downtime for 2:00:00 on thanos-be2002.codfw.wmnet with reason: host reimage [production]
15:31 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance [production]
15:31 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 6:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance [production]
15:31 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1202 (T322618)', diff saved to https://phabricator.wikimedia.org/P42654 and previous config saved to /var/cache/conftool/dbconfig/20221208-153123-ladsgroup.json [production]
15:27 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ganeti5002.eqsin.wmnet [production]
15:27 <jmm@cumin2002> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
15:27 <jmm@cumin2002> END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti5002.eqsin.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002" [production]
15:27 <jiji@deploy1002> helmfile [eqiad] DONE helmfile.d/services/tegola-vector-tiles: apply [production]
15:26 <jiji@deploy1002> helmfile [eqiad] START helmfile.d/services/tegola-vector-tiles: apply [production]
15:25 <mvernon@cumin2002> START - Cookbook sre.hosts.reimage for host thanos-be2002.codfw.wmnet with OS bullseye [production]
15:24 <jmm@cumin2002> START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ganeti5002.eqsin.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002" [production]
15:21 <jmm@cumin2002> START - Cookbook sre.dns.netbox [production]
15:16 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P42653 and previous config saved to /var/cache/conftool/dbconfig/20221208-151616-ladsgroup.json [production]
15:15 <eevans@cumin1001> END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts restbase-dev2001.codfw.wmnet [production]
15:15 <eevans@cumin1001> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
15:15 <eevans@cumin1001> END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: restbase-dev2001.codfw.wmnet decommissioned, removing all IPs except the asset tag one - eevans@cumin1001" [production]
15:13 <eevans@cumin1001> START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: restbase-dev2001.codfw.wmnet decommissioned, removing all IPs except the asset tag one - eevans@cumin1001" [production]
15:12 <hashar> Restarted Gerrit TWICE on gerrit1001.wikimedia.org to apply `-Dh2.maxCompactTime` and get it to trigger compaction # T323754 [production]
15:12 <jmm@cumin2002> START - Cookbook sre.hosts.decommission for hosts ganeti5002.eqsin.wmnet [production]
15:10 <eevans@cumin1001> START - Cookbook sre.dns.netbox [production]
15:09 <jiji@deploy1002> helmfile [eqiad] DONE helmfile.d/services/mw-web: apply [production]
15:08 <jiji@deploy1002> helmfile [eqiad] START helmfile.d/services/mw-web: apply [production]
15:08 <jiji@deploy1002> helmfile [codfw] DONE helmfile.d/services/mw-web: apply [production]