2023-07-07
10:09 <moritzm> rebooting puppetserver1001 [production]
10:06 <jmm@cumin2002> END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host puppetdb2003.codfw.wmnet [production]
10:05 <moritzm> rebooting puppetserver2001 [production]
10:05 <jiji@deploy1002> helmfile [staging] DONE helmfile.d/services/ipoid: apply [production]
10:03 <jiji@deploy1002> helmfile [staging] START helmfile.d/services/ipoid: apply [production]
09:59 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow1002.eqiad.wmnet [production]
09:55 <jmm@cumin2002> START - Cookbook sre.hosts.reboot-single for host puppetdb2003.codfw.wmnet [production]
09:55 <jmm@cumin2002> START - Cookbook sre.hosts.reboot-single for host netflow1002.eqiad.wmnet [production]
09:52 <jmm@cumin2002> END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host debmonitor2003.codfw.wmnet [production]
09:52 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow2003.codfw.wmnet [production]
09:46 <jmm@cumin2002> START - Cookbook sre.hosts.reboot-single for host netflow2003.codfw.wmnet [production]
09:46 <jmm@cumin2002> START - Cookbook sre.hosts.reboot-single for host debmonitor2003.codfw.wmnet [production]
09:45 <stevemunene@cumin1001> END (FAIL) - Cookbook sre.hadoop.roll-restart-masters (exit_code=99) restart masters for Hadoop analytics cluster: Restart of jvm daemons. [production]
09:39 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host debmonitor1003.eqiad.wmnet [production]
09:37 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow6001.drmrs.wmnet [production]
09:35 <jmm@cumin2002> START - Cookbook sre.hosts.reboot-single for host debmonitor1003.eqiad.wmnet [production]
09:34 <jmm@cumin2002> END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host lists1003.wikimedia.org [production]
09:33 <jmm@cumin2002> START - Cookbook sre.hosts.reboot-single for host netflow6001.drmrs.wmnet [production]
09:29 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow6001.drmrs.wmnet [production]
09:29 <stevemunene@cumin1001> START - Cookbook sre.hadoop.roll-restart-masters restart masters for Hadoop analytics cluster: Restart of jvm daemons. [production]
09:26 <jmm@cumin2002> START - Cookbook sre.hosts.reboot-single for host netflow6001.drmrs.wmnet [production]
09:24 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow3002.esams.wmnet [production]
09:24 <jmm@cumin2002> START - Cookbook sre.hosts.reboot-single for host lists1003.wikimedia.org [production]
09:20 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host people1004.eqiad.wmnet [production]
09:19 <jmm@cumin2002> START - Cookbook sre.hosts.reboot-single for host people1004.eqiad.wmnet [production]
09:19 <jmm@cumin2002> START - Cookbook sre.hosts.reboot-single for host netflow3002.esams.wmnet [production]
09:18 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow5002.eqsin.wmnet [production]
09:17 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host people2003.codfw.wmnet [production]
09:13 <jmm@cumin2002> START - Cookbook sre.hosts.reboot-single for host people2003.codfw.wmnet [production]
09:12 <jmm@cumin2002> START - Cookbook sre.hosts.reboot-single for host netflow5002.eqsin.wmnet [production]
08:53 <btullis@deploy1002> helmfile [staging] DONE helmfile.d/services/datahub: sync on main [production]
08:50 <btullis@deploy1002> helmfile [staging] START helmfile.d/services/datahub: apply on main [production]
08:48 <moritzm> installing bookworm kernel updates [production]
08:47 <jmm@cumin2002> END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: xhgui2002.codfw.wmnet [production]
08:47 <jmm@cumin2002> START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: xhgui2002.codfw.wmnet [production]
08:46 <jmm@cumin2002> END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: xhgui1002.eqiad.wmnet [production]
08:46 <jmm@cumin2002> START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: xhgui1002.eqiad.wmnet [production]
08:05 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on kafka-test[1006-1010].eqiad.wmnet with reason: resetting cluster [production]
08:05 <elukey@cumin1001> START - Cookbook sre.hosts.downtime for 0:30:00 on kafka-test[1006-1010].eqiad.wmnet with reason: resetting cluster [production]
01:55 <bking@cumin1001> END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) [production]
00:28 <bking@cumin1001> START - Cookbook sre.wdqs.data-transfer [production]
2023-07-06
23:14 <mutante> mx1001 - rm /usr/local/bin/otrs_aliases ; rm /lib/systemd/system/generate_otrs_aliases.* after deploying gerrit:932316 which renamed script and timer without absenting them [production]
23:08 <mutante> mx2001 - rm /usr/local/bin/otrs_aliases ; rm /lib/systemd/system/generate_otrs_aliases.* after deploying gerrit:932316 which renamed script and timer without absenting them [production]
21:12 <thcipriani@deploy1002> Finished scap: Clean up font directory [[gerrit:723652]] (duration: 06m 33s) [production]
21:10 <bking@deploy1002> Finished deploy [wdqs/wdqs@dff41b7]: 0.3.124 (duration: 14m 56s) [production]
21:06 <thcipriani@deploy1002> Started scap: Clean up font directory [[gerrit:723652]] [production]
21:04 <thcipriani@deploy1002> Finished scap: Backport for [[gerrit:936084|pawikibooks: Install Quiz extension (T340613)]] (duration: 12m 19s) [production]
20:55 <bking@deploy1002> Started deploy [wdqs/wdqs@dff41b7]: 0.3.124 [production]
20:54 <bking@deploy1002> Finished deploy [wdqs/wdqs@dff41b7]: 0.3.124 (duration: 00m 05s) [production]
20:54 <bking@deploy1002> Started deploy [wdqs/wdqs@dff41b7]: 0.3.124 [production]