2023-07-18
09:24 <mvernon@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1074.eqiad.wmnet [production]
09:24 <XioNoX> remove asw-b1-codfw from asw-b-codfw VC - T342076 [production]
09:21 <isaranto@deploy1002> helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' . [production]
09:21 <isaranto@deploy1002> helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' . [production]
09:20 <isaranto@deploy1002> helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' . [production]
09:18 <mvernon@cumin2002> START - Cookbook sre.hosts.reboot-single for host thanos-be1001.eqiad.wmnet [production]
09:17 <isaranto@deploy1002> helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' . [production]
09:16 <isaranto@deploy1002> helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' . [production]
09:16 <mvernon@cumin1001> START - Cookbook sre.hosts.reboot-single for host ms-be1074.eqiad.wmnet [production]
09:15 <isaranto@deploy1002> helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' . [production]
09:10 <mvernon@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2073.codfw.wmnet [production]
09:09 <mvernon@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1073.eqiad.wmnet [production]
09:08 <ladsgroup@deploy1002> Finished scap: Backport for [[gerrit:937453|ores: use envoy proxy for Lift Wing (T319170)]] (duration: 14m 56s) [production]
09:07 <isaranto@deploy1002> helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' . [production]
09:02 <mvernon@cumin1001> START - Cookbook sre.hosts.reboot-single for host ms-be1073.eqiad.wmnet [production]
09:02 <mvernon@cumin2002> START - Cookbook sre.hosts.reboot-single for host ms-be2073.codfw.wmnet [production]
08:58 <fabfur> enable puppet on A:cp-eqiad for https://gerrit.wikimedia.org/r/939235 (T340983) (hosts will run puppet with the usual schedule) [production]
08:57 <ladsgroup@deploy1002> isaranto and ladsgroup: Backport for [[gerrit:937453|ores: use envoy proxy for Lift Wing (T319170)]] synced to the testservers mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, and mw-debug kubernetes deployment (accessible via k8s-experimental XWD option) [production]
08:56 <mvernon@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2072.codfw.wmnet [production]
08:56 <mvernon@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1072.eqiad.wmnet [production]
08:55 <fabfur> disable puppet on A:cp-eqiad to apply https://gerrit.wikimedia.org/r/c/operations/puppet/+/939235 (T340983) [production]
08:53 <ladsgroup@deploy1002> Started scap: Backport for [[gerrit:937453|ores: use envoy proxy for Lift Wing (T319170)]] [production]
08:48 <mvernon@cumin1001> START - Cookbook sre.hosts.reboot-single for host ms-be1072.eqiad.wmnet [production]
08:48 <mvernon@cumin2002> START - Cookbook sre.hosts.reboot-single for host ms-be2072.codfw.wmnet [production]
08:47 <mvernon@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2071.codfw.wmnet [production]
08:46 <mvernon@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1071.eqiad.wmnet [production]
08:37 <mvernon@cumin2002> START - Cookbook sre.hosts.reboot-single for host ms-be2071.codfw.wmnet [production]
08:37 <mvernon@cumin1001> START - Cookbook sre.hosts.reboot-single for host ms-be1071.eqiad.wmnet [production]
08:36 <mvernon@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2070.codfw.wmnet [production]
08:34 <mvernon@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1070.eqiad.wmnet [production]
08:28 <mvernon@cumin2002> START - Cookbook sre.hosts.reboot-single for host ms-be2070.codfw.wmnet [production]
08:27 <mvernon@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2069.codfw.wmnet [production]
08:25 <mvernon@cumin1001> START - Cookbook sre.hosts.reboot-single for host ms-be1070.eqiad.wmnet [production]
08:25 <mvernon@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1069.eqiad.wmnet [production]
08:18 <mvernon@cumin2002> START - Cookbook sre.hosts.reboot-single for host ms-be2069.codfw.wmnet [production]
08:17 <fabfur> enable puppet on A:cp-drmrs for https://gerrit.wikimedia.org/r/c/operations/puppet/+/938902/ (T340983) (hosts will run puppet with the usual schedule) [production]
08:16 <mvernon@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2068.codfw.wmnet [production]
08:13 <fabfur> disable puppet on A:cp-drmrs to apply https://gerrit.wikimedia.org/r/c/operations/puppet/+/938902/ (T340983) [production]
08:09 <topranks> cr3-knams going offline for move [production]
08:08 <mvernon@cumin1001> START - Cookbook sre.hosts.reboot-single for host ms-be1069.eqiad.wmnet [production]
08:08 <mvernon@cumin2002> START - Cookbook sre.hosts.reboot-single for host ms-be2068.codfw.wmnet [production]
07:16 <elukey> restart kafka main-codfw rebalances (long maintenance) - T341558 [production]
06:48 <XioNoX> disable asw-b-codfw:ae0 (to cloudsw1-b1-codfw) - T342076 [production]
06:36 <cmooney@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on cr3-knams,cr3-knams IPv6 with reason: Downtime cr3-knams ahead of remote hands moving router [production]
06:36 <cmooney@cumin1001> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on cr3-knams,cr3-knams IPv6 with reason: Downtime cr3-knams ahead of remote hands moving router [production]
2023-07-17
21:57 <btullis@deploy1002> Finished deploy [analytics/aqs/deploy@91f8d92] (aqs-next): Deploying new AQS endpoint (duration: 02m 10s) [production]
21:55 <btullis@deploy1002> Started deploy [analytics/aqs/deploy@91f8d92] (aqs-next): Deploying new AQS endpoint [production]
21:55 <btullis@deploy1002> Finished deploy [analytics/aqs/deploy@91f8d92] (aqs-next): Deploying new AQS endpoint (duration: 136m 46s) [production]
21:53 <bking@cumin1001> END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host flink-zk1001.eqiad.wmnet [production]
21:52 <bking@cumin1001> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host flink-zk1001.eqiad.wmnet with OS bookworm [production]