1601-1650 of 10000 results (34ms)
2021-03-02 ยง
11:12 <hnowlan@deploy1002> helmfile [staging] Ran 'sync' command on namespace 'api-gateway' for release 'staging' . [production]
11:11 <hnowlan@deploy1002> helmfile [staging] Ran 'sync' command on namespace 'api-gateway' for release 'production' . [production]
10:37 <jiji@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1028.eqiad.wmnet [production]
10:31 <jiji@cumin1001> START - Cookbook sre.hosts.reboot-single for host mc1028.eqiad.wmnet [production]
10:29 <effie> upgrade memcached on mc2024, mc1028 [production]
10:21 <elukey@cumin1001> END (PASS) - Cookbook sre.hadoop.init-hadoop-workers (exit_code=0) for hosts an-worker1119.eqiad.wmnet [production]
10:18 <elukey@cumin1001> START - Cookbook sre.hadoop.init-hadoop-workers for hosts an-worker1119.eqiad.wmnet [production]
10:12 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-worker1119.eqiad.wmnet with reason: REIMAGE [production]
10:09 <elukey@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on an-worker1119.eqiad.wmnet with reason: REIMAGE [production]
10:05 <volans@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest1002.eqiad.wmnet with reason: REIMAGE [production]
10:03 <volans@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on sretest1002.eqiad.wmnet with reason: REIMAGE [production]
09:54 <elukey@cumin1001> END (PASS) - Cookbook sre.hadoop.init-hadoop-workers (exit_code=0) for hosts an-worker[1130-1131].eqiad.wmnet [production]
09:52 <elukey@cumin1001> START - Cookbook sre.hadoop.init-hadoop-workers for hosts an-worker[1130-1131].eqiad.wmnet [production]
09:46 <liw@deploy1002> Finished scap: testwikis wikis to 1.36.0-wmf.33 (duration: 36m 20s) [production]
09:43 <elukey@cumin1001> END (PASS) - Cookbook sre.hadoop.init-hadoop-workers (exit_code=0) for hosts an-worker[1124-1128].eqiad.wmnet [production]
09:41 <elukey@cumin1001> START - Cookbook sre.hadoop.init-hadoop-workers for hosts an-worker[1124-1128].eqiad.wmnet [production]
09:39 <elukey@cumin1001> END (PASS) - Cookbook sre.hadoop.init-hadoop-workers (exit_code=0) for hosts an-worker[1120-1123].eqiad.wmnet [production]
09:37 <elukey@cumin1001> START - Cookbook sre.hadoop.init-hadoop-workers for hosts an-worker[1120-1123].eqiad.wmnet [production]
09:36 <elukey@cumin1001> END (PASS) - Cookbook sre.hadoop.init-hadoop-workers (exit_code=0) for hosts an-worker1119.eqiad.wmnet [production]
09:33 <elukey@cumin1001> START - Cookbook sre.hadoop.init-hadoop-workers for hosts an-worker1119.eqiad.wmnet [production]
09:12 <liw@deploy1002> Started scap: testwikis wikis to 1.36.0-wmf.33 [production]
08:58 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-worker1131.eqiad.wmnet with reason: REIMAGE [production]
08:56 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-worker1130.eqiad.wmnet with reason: REIMAGE [production]
08:56 <elukey@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on an-worker1131.eqiad.wmnet with reason: REIMAGE [production]
08:54 <elukey@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on an-worker1130.eqiad.wmnet with reason: REIMAGE [production]
08:54 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-worker1128.eqiad.wmnet with reason: REIMAGE [production]
08:53 <vgutierrez> rolling restart of ats-tls on ulsfo [production]
08:52 <elukey@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on an-worker1128.eqiad.wmnet with reason: REIMAGE [production]
08:39 <kharlan@deploy1002> helmfile [staging] Ran 'sync' command on namespace 'linkrecommendation' for release 'staging' . [production]
08:30 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-worker1127.eqiad.wmnet with reason: REIMAGE [production]
08:28 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-worker1126.eqiad.wmnet with reason: REIMAGE [production]
08:27 <elukey@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on an-worker1127.eqiad.wmnet with reason: REIMAGE [production]
08:25 <elukey@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on an-worker1126.eqiad.wmnet with reason: REIMAGE [production]
08:25 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-worker1125.eqiad.wmnet with reason: REIMAGE [production]
08:23 <elukey@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on an-worker1125.eqiad.wmnet with reason: REIMAGE [production]
08:00 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-worker1124.eqiad.wmnet with reason: REIMAGE [production]
07:59 <liw> 1.36.0-wmf.33 was branched at 800e1f8cea169fc9c6e72ac1dc197591a06701bd for T274937 [production]
07:58 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-worker1123.eqiad.wmnet with reason: REIMAGE [production]
07:58 <elukey@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on an-worker1124.eqiad.wmnet with reason: REIMAGE [production]
07:56 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-worker1122.eqiad.wmnet with reason: REIMAGE [production]
07:56 <elukey@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on an-worker1123.eqiad.wmnet with reason: REIMAGE [production]
07:54 <godog> swift eqiad-prod: add weight to ms-be106[0-3] - T268435 [production]
07:54 <elukey@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on an-worker1122.eqiad.wmnet with reason: REIMAGE [production]
07:28 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-worker1121.eqiad.wmnet with reason: REIMAGE [production]
07:27 <ryankemper> Pooled `elastic106[0,4]` (Noticed I never re-pooled these hosts after resolving an incident last week) [production]
07:26 <elukey@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on an-worker1121.eqiad.wmnet with reason: REIMAGE [production]
07:26 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-worker1120.eqiad.wmnet with reason: REIMAGE [production]
07:24 <elukey@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on an-worker1120.eqiad.wmnet with reason: REIMAGE [production]
05:40 <Amir1> apply gerrit:667757 on mwdebug1002 to test T259360 [production]
05:38 <marostegui@cumin1001> dbctl commit (dc=all): 'Pool db2152 into s8 as vslow - T275633', diff saved to https://phabricator.wikimedia.org/P14551 and previous config saved to /var/cache/conftool/dbconfig/20210302-053814-marostegui.json [production]