2024-06-06
08:50 <dcaro@cumin1002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcephosd1031.eqiad.wmnet [production]
08:47 <mvernon@cumin1002> END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host thanos-be1003.eqiad.wmnet [production]
08:44 <dcaro@cumin1002> START - Cookbook sre.hosts.reboot-single for host cloudcephosd1031.eqiad.wmnet [production]
08:44 <pfischer@deploy1002> helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply [production]
08:43 <mvernon@cumin2002> START - Cookbook sre.hosts.reboot-single for host thanos-be2002.codfw.wmnet [production]
08:40 <mvernon@cumin2002> END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host thanos-be2001.codfw.wmnet [production]
08:39 <sfaci@deploy1002> helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mpic-next: apply [production]
08:39 <sfaci@deploy1002> helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mpic-next: apply [production]
08:38 <filippo@cumin1002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host prometheus2005.codfw.wmnet [production]
08:37 <arnaudb@cumin1002> dbctl commit (dc=all): 'db1246 (re)pooling @ 2%: post maintenance repool', diff saved to https://phabricator.wikimedia.org/P64171 and previous config saved to /var/cache/conftool/dbconfig/20240606-083710-arnaudb.json [production]
08:36 <mvernon@cumin1002> START - Cookbook sre.hosts.reboot-single for host thanos-be1003.eqiad.wmnet [production]
08:35 <pfischer@deploy1002> helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply [production]
08:35 <pfischer@deploy1002> helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply [production]
08:19 <filippo@cumin1002> START - Cookbook sre.hosts.reboot-single for host prometheus1005.eqiad.wmnet [production]
08:17 <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2219 (T364299)', diff saved to https://phabricator.wikimedia.org/P64167 and previous config saved to /var/cache/conftool/dbconfig/20240606-081753-marostegui.json [production]
08:14 <stevemunene@deploy1002> helmfile [eqiad] DONE helmfile.d/admin 'apply'. [production]
08:14 <stevemunene@deploy1002> helmfile [eqiad] START helmfile.d/admin 'apply'. [production]
08:14 <ladsgroup@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P64166 and previous config saved to /var/cache/conftool/dbconfig/20240606-081412-ladsgroup.json [production]
08:02 <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P64165 and previous config saved to /var/cache/conftool/dbconfig/20240606-080245-marostegui.json [production]
08:02 <mvernon@cumin1002> START - Cookbook sre.hosts.reboot-single for host thanos-be1002.eqiad.wmnet [production]
08:01 <mvernon@cumin1002> END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host thanos-be1001.eqiad.wmnet [production]
08:00 <urbanecm@deploy1002> Started scap: Backport for [[gerrit:1039287|Add throttle exception for an upcoming workshop (T366748)]] [production]
07:59 <ladsgroup@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P64164 and previous config saved to /var/cache/conftool/dbconfig/20240606-075904-ladsgroup.json [production]
07:50 <mvernon@cumin1002> START - Cookbook sre.hosts.reboot-single for host thanos-be1001.eqiad.wmnet [production]
07:47 <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P64163 and previous config saved to /var/cache/conftool/dbconfig/20240606-074737-marostegui.json [production]
07:43 <ladsgroup@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1169 (T352010)', diff saved to https://phabricator.wikimedia.org/P64162 and previous config saved to /var/cache/conftool/dbconfig/20240606-074356-ladsgroup.json [production]
07:32 <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2219 (T364299)', diff saved to https://phabricator.wikimedia.org/P64161 and previous config saved to /var/cache/conftool/dbconfig/20240606-073229-marostegui.json [production]
07:30 <ryankemper@cumin2002> END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: codfw cluster restart - ryankemper@cumin2002 - T366555 [production]
07:06 <hashar> Restarting Gerrit [production]
07:05 <ladsgroup@cumin1002> dbctl commit (dc=all): 'Depooling db2116 (T352010)', diff saved to https://phabricator.wikimedia.org/P64160 and previous config saved to /var/cache/conftool/dbconfig/20240606-070558-ladsgroup.json [production]
07:05 <ladsgroup@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2116.codfw.wmnet with reason: Maintenance [production]
07:05 <ladsgroup@cumin1002> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2116.codfw.wmnet with reason: Maintenance [production]
06:56 <dcaro@cumin1002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcephosd1034.eqiad.wmnet [production]
06:49 <dcaro@cumin1002> START - Cookbook sre.hosts.reboot-single for host cloudcephosd1034.eqiad.wmnet [production]
05:40 <ryankemper@cumin2002> END (FAIL) - Cookbook sre.wdqs.data-reload (exit_code=99) reloading wikidata_full on wdqs2023.codfw.wmnet from DumpsSource.HDFS (hdfs:///wmf/data/discovery/wikidata/munged_n3_dump/wikidata/full/20240527/ using stat1009.eqiad.wmnet) [production]
05:21 <ryankemper@cumin2002> START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: codfw cluster restart - ryankemper@cumin2002 - T366555 [production]
05:19 <ryankemper@cumin2002> END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: codfw cluster restart - ryankemper@cumin2002 - T366555 [production]
05:04 <ryankemper@cumin2002> START - Cookbook sre.wdqs.data-reload reloading wikidata_full on wdqs2023.codfw.wmnet from DumpsSource.HDFS (hdfs:///wmf/data/discovery/wikidata/munged_n3_dump/wikidata/full/20240527/ using stat1009.eqiad.wmnet) [production]
05:02 <ryankemper@cumin2002> START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: codfw cluster restart - ryankemper@cumin2002 - T366555 [production]
04:17 <marostegui@cumin1002> dbctl commit (dc=all): 'Depooling db2219 (T364299)', diff saved to https://phabricator.wikimedia.org/P64159 and previous config saved to /var/cache/conftool/dbconfig/20240606-041714-marostegui.json [production]
04:17 <marostegui@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2219.codfw.wmnet with reason: Maintenance [production]
04:16 <marostegui@cumin1002> START - Cookbook sre.hosts.downtime for 6:00:00 on db2219.codfw.wmnet with reason: Maintenance [production]
04:16 <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2210 (T364299)', diff saved to https://phabricator.wikimedia.org/P64158 and previous config saved to /var/cache/conftool/dbconfig/20240606-041650-marostegui.json [production]
04:01 <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P64157 and previous config saved to /var/cache/conftool/dbconfig/20240606-040142-marostegui.json [production]
03:47 <ladsgroup@cumin1002> dbctl commit (dc=all): 'Depooling db1193 (T352010)', diff saved to https://phabricator.wikimedia.org/P64156 and previous config saved to /var/cache/conftool/dbconfig/20240606-034732-ladsgroup.json [production]
03:47 <ladsgroup@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1193.eqiad.wmnet with reason: Maintenance [production]
03:47 <ladsgroup@cumin1002> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1193.eqiad.wmnet with reason: Maintenance [production]
03:47 <ladsgroup@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1178 (T352010)', diff saved to https://phabricator.wikimedia.org/P64155 and previous config saved to /var/cache/conftool/dbconfig/20240606-034709-ladsgroup.json [production]
03:46 <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P64154 and previous config saved to /var/cache/conftool/dbconfig/20240606-034635-marostegui.json [production]
03:32 <ladsgroup@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P64153 and previous config saved to /var/cache/conftool/dbconfig/20240606-033201-ladsgroup.json [production]