2024-06-06
08:35 <pfischer@deploy1002> helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply [production]
08:19 <filippo@cumin1002> START - Cookbook sre.hosts.reboot-single for host prometheus1005.eqiad.wmnet [production]
08:17 <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2219 (T364299)', diff saved to https://phabricator.wikimedia.org/P64167 and previous config saved to /var/cache/conftool/dbconfig/20240606-081753-marostegui.json [production]
08:14 <stevemunene@deploy1002> helmfile [eqiad] DONE helmfile.d/admin 'apply'. [production]
08:14 <stevemunene@deploy1002> helmfile [eqiad] START helmfile.d/admin 'apply'. [production]
08:14 <ladsgroup@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P64166 and previous config saved to /var/cache/conftool/dbconfig/20240606-081412-ladsgroup.json [production]
08:02 <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P64165 and previous config saved to /var/cache/conftool/dbconfig/20240606-080245-marostegui.json [production]
08:02 <mvernon@cumin1002> START - Cookbook sre.hosts.reboot-single for host thanos-be1002.eqiad.wmnet [production]
08:01 <mvernon@cumin1002> END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host thanos-be1001.eqiad.wmnet [production]
08:00 <urbanecm@deploy1002> Started scap: Backport for [[gerrit:1039287|Add throttle exception for an upcoming workshop (T366748)]] [production]
07:59 <ladsgroup@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P64164 and previous config saved to /var/cache/conftool/dbconfig/20240606-075904-ladsgroup.json [production]
07:50 <mvernon@cumin1002> START - Cookbook sre.hosts.reboot-single for host thanos-be1001.eqiad.wmnet [production]
07:47 <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P64163 and previous config saved to /var/cache/conftool/dbconfig/20240606-074737-marostegui.json [production]
07:43 <ladsgroup@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1169 (T352010)', diff saved to https://phabricator.wikimedia.org/P64162 and previous config saved to /var/cache/conftool/dbconfig/20240606-074356-ladsgroup.json [production]
07:32 <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2219 (T364299)', diff saved to https://phabricator.wikimedia.org/P64161 and previous config saved to /var/cache/conftool/dbconfig/20240606-073229-marostegui.json [production]
07:30 <ryankemper@cumin2002> END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: codfw cluster restart - ryankemper@cumin2002 - T366555 [production]
07:06 <hashar> Restarting Gerrit [production]
07:05 <ladsgroup@cumin1002> dbctl commit (dc=all): 'Depooling db2116 (T352010)', diff saved to https://phabricator.wikimedia.org/P64160 and previous config saved to /var/cache/conftool/dbconfig/20240606-070558-ladsgroup.json [production]
07:05 <ladsgroup@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2116.codfw.wmnet with reason: Maintenance [production]
07:05 <ladsgroup@cumin1002> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2116.codfw.wmnet with reason: Maintenance [production]
06:56 <dcaro@cumin1002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcephosd1034.eqiad.wmnet [production]
06:49 <dcaro@cumin1002> START - Cookbook sre.hosts.reboot-single for host cloudcephosd1034.eqiad.wmnet [production]
05:40 <ryankemper@cumin2002> END (FAIL) - Cookbook sre.wdqs.data-reload (exit_code=99) reloading wikidata_full on wdqs2023.codfw.wmnet from DumpsSource.HDFS (hdfs:///wmf/data/discovery/wikidata/munged_n3_dump/wikidata/full/20240527/ using stat1009.eqiad.wmnet) [production]
05:21 <ryankemper@cumin2002> START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: codfw cluster restart - ryankemper@cumin2002 - T366555 [production]
05:19 <ryankemper@cumin2002> END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: codfw cluster restart - ryankemper@cumin2002 - T366555 [production]
05:04 <ryankemper@cumin2002> START - Cookbook sre.wdqs.data-reload reloading wikidata_full on wdqs2023.codfw.wmnet from DumpsSource.HDFS (hdfs:///wmf/data/discovery/wikidata/munged_n3_dump/wikidata/full/20240527/ using stat1009.eqiad.wmnet) [production]
05:02 <ryankemper@cumin2002> START - Cookbook sre.elasticsearch.rolling-operation Operation.REBOOT (3 nodes at a time) for ElasticSearch cluster search_codfw: codfw cluster restart - ryankemper@cumin2002 - T366555 [production]
04:17 <marostegui@cumin1002> dbctl commit (dc=all): 'Depooling db2219 (T364299)', diff saved to https://phabricator.wikimedia.org/P64159 and previous config saved to /var/cache/conftool/dbconfig/20240606-041714-marostegui.json [production]
04:17 <marostegui@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2219.codfw.wmnet with reason: Maintenance [production]
04:16 <marostegui@cumin1002> START - Cookbook sre.hosts.downtime for 6:00:00 on db2219.codfw.wmnet with reason: Maintenance [production]
04:16 <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2210 (T364299)', diff saved to https://phabricator.wikimedia.org/P64158 and previous config saved to /var/cache/conftool/dbconfig/20240606-041650-marostegui.json [production]
04:01 <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P64157 and previous config saved to /var/cache/conftool/dbconfig/20240606-040142-marostegui.json [production]
03:47 <ladsgroup@cumin1002> dbctl commit (dc=all): 'Depooling db1193 (T352010)', diff saved to https://phabricator.wikimedia.org/P64156 and previous config saved to /var/cache/conftool/dbconfig/20240606-034732-ladsgroup.json [production]
03:47 <ladsgroup@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1193.eqiad.wmnet with reason: Maintenance [production]
03:47 <ladsgroup@cumin1002> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1193.eqiad.wmnet with reason: Maintenance [production]
03:47 <ladsgroup@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1178 (T352010)', diff saved to https://phabricator.wikimedia.org/P64155 and previous config saved to /var/cache/conftool/dbconfig/20240606-034709-ladsgroup.json [production]
03:46 <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P64154 and previous config saved to /var/cache/conftool/dbconfig/20240606-034635-marostegui.json [production]
03:32 <ladsgroup@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P64153 and previous config saved to /var/cache/conftool/dbconfig/20240606-033201-ladsgroup.json [production]
03:31 <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2210 (T364299)', diff saved to https://phabricator.wikimedia.org/P64152 and previous config saved to /var/cache/conftool/dbconfig/20240606-033125-marostegui.json [production]
03:29 <ladsgroup@cumin1002> dbctl commit (dc=all): 'Depooling db2161 (T352010)', diff saved to https://phabricator.wikimedia.org/P64151 and previous config saved to /var/cache/conftool/dbconfig/20240606-032907-ladsgroup.json [production]
03:29 <ladsgroup@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2161.codfw.wmnet with reason: Maintenance [production]
03:28 <ladsgroup@cumin1002> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2161.codfw.wmnet with reason: Maintenance [production]
03:28 <ladsgroup@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2154 (T352010)', diff saved to https://phabricator.wikimedia.org/P64150 and previous config saved to /var/cache/conftool/dbconfig/20240606-032844-ladsgroup.json [production]
03:16 <ladsgroup@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P64149 and previous config saved to /var/cache/conftool/dbconfig/20240606-031653-ladsgroup.json [production]
03:13 <ladsgroup@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2154', diff saved to https://phabricator.wikimedia.org/P64148 and previous config saved to /var/cache/conftool/dbconfig/20240606-031336-ladsgroup.json [production]
03:01 <ladsgroup@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1178 (T352010)', diff saved to https://phabricator.wikimedia.org/P64147 and previous config saved to /var/cache/conftool/dbconfig/20240606-030145-ladsgroup.json [production]
02:58 <ladsgroup@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2154', diff saved to https://phabricator.wikimedia.org/P64146 and previous config saved to /var/cache/conftool/dbconfig/20240606-025828-ladsgroup.json [production]
02:43 <ladsgroup@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2154 (T352010)', diff saved to https://phabricator.wikimedia.org/P64145 and previous config saved to /var/cache/conftool/dbconfig/20240606-024321-ladsgroup.json [production]
01:22 <marostegui@cumin1002> dbctl commit (dc=all): 'Depooling db1244 (T364069)', diff saved to https://phabricator.wikimedia.org/P64144 and previous config saved to /var/cache/conftool/dbconfig/20240606-012208-marostegui.json [production]
01:22 <marostegui@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1244.eqiad.wmnet with reason: Maintenance [production]