2021-07-14
17:39 <razzi@cumin1001> START - Cookbook sre.druid.roll-restart-workers for Druid public cluster: Roll restart of Druid's jvm daemons. - razzi@cumin1001 [production]
17:35 <root@cumin2002> START - Cookbook sre.idm.logout Logging Muehlenhoff out of all services on: 1733 hosts [production]
17:35 <dancy@deploy1002> Synchronized php-1.37.0-wmf.14/extensions/CentralAuth/includes/specials/SpecialCentralAutoLogin.php: Backport: [[gerrit:704383|Do not lock preferences row for a rememberpassword check (T286521)]] (duration: 01m 06s) [production]
17:00 <dancy@deploy1002> Synchronized php-1.37.0-wmf.12/extensions/CentralAuth/includes/specials/SpecialCentralAutoLogin.php: Backport: [[gerrit:704382|Do not lock preferences row for a rememberpassword check (T286521)]] (duration: 01m 05s) [production]
16:27 <root@cumin2002> END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Muehlenhoff out of all services on: 1733 hosts [production]
16:26 <root@cumin2002> START - Cookbook sre.idm.logout Logging Muehlenhoff out of all services on: 1733 hosts [production]
16:11 <dancy@deploy1002> Synchronized php-1.37.0-wmf.12/extensions/Translate: Backport: [[gerrit:704404|TranslationAid: Handle empty message definition (T285830)]] and [[gerrit:704405|TranslationAid: Make sure to return successfully fetched definitions (T285830)]] (duration: 01m 09s) [production]
16:07 <otto@deploy1002> helmfile [codfw] Ran 'sync' command on namespace 'eventgate-analytics' for release 'production' . [production]
15:37 <moritzm> installing klibc security updates [production]
15:36 <ottomata> deploying eventgate-analytics with direct service-runner Prometheus support [production]
15:34 <ryankemper> [Elastic] Manually triggering readahead mitigation across whole fleet to prevent any further issues today: `ryankemper@cumin1001:~$ sudo cumin -b 12 'P{elastic*}' 'sudo systemctl restart elasticsearch-disable-readahead.service'` (still need to investigate why `elasticsearch-disable-readahead.timer` isn't re-firing every 30 mins as desired) [production]
15:34 <moritzm> installing apache security updates on otrs1001 (ticket.wikimedia.org) [production]
15:34 <otto@deploy1002> helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-analytics' for release 'production' . [production]
15:28 <urbanecm> Start server-side upload of 3 large image files (T285708) [production]
15:16 <moritzm> installing apache security updates on lists1001 (lists.wikimedia.org) [production]
14:51 <moritzm> installing apache security updates on puppet masters [production]
14:47 <jiji@cumin1001> conftool action : set/pooled=inactive; selector: name=mw2384.codfw.wmnet [production]
14:47 <effie> set mw2384 as inactive to investigate mw2383 issue - T286463 [production]
14:44 <jiji@deploy1002> helmfile [codfw] START helmfile.d/admin 'apply'. [production]
14:44 <moritzm> installing apache security updates on grafana* [production]
14:43 <jiji@deploy1002> helmfile [eqiad] DONE helmfile.d/admin 'apply'. [production]
14:43 <jiji@deploy1002> helmfile [eqiad] START helmfile.d/admin 'apply'. [production]
14:40 <jiji@deploy1002> helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'. [production]
14:40 <jiji@deploy1002> helmfile [staging-eqiad] START helmfile.d/admin 'apply'. [production]
14:38 <dzahn@cumin1001> conftool action : set/pooled=yes; selector: name=mw1422.eqiad.wmnet [production]
14:33 <dcausse> running elasticsearch-madvise-random ES_PID on elastic2045 [production]
14:31 <dcausse> running elasticsearch-madvise-random 1022 on elastic2054 [production]
14:23 <jiji@deploy1002> helmfile [staging-codfw] DONE helmfile.d/admin 'apply'. [production]
14:19 <jiji@deploy1002> helmfile [staging-codfw] START helmfile.d/admin 'apply'. [production]
14:19 <jiji@deploy1002> helmfile [staging-codfw] DONE helmfile.d/admin 'apply'. [production]
14:19 <jiji@deploy1002> helmfile [staging-codfw] START helmfile.d/admin 'apply'. [production]
14:13 <elukey> restart php-fpm on mw2370 [production]
13:43 <jmm@cumin2002> END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Muehlenhoff out of all services on: 1733 hosts [production]
13:43 <jmm@cumin2002> START - Cookbook sre.idm.logout Logging Muehlenhoff out of all services on: 1733 hosts [production]
13:09 <kormat@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on 18 hosts with reason: Deploying schema change to s1 T277118 [production]
13:09 <kormat@cumin1001> START - Cookbook sre.hosts.downtime for 4:00:00 on 18 hosts with reason: Deploying schema change to s1 T277118 [production]
12:47 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rdb1005.eqiad.wmnet [production]
12:43 <urbanecm> Start server-side upload of 3 large image files (T285708) [production]
12:37 <jmm@cumin2002> START - Cookbook sre.hosts.reboot-single for host rdb1005.eqiad.wmnet [production]
12:24 <jmm@cumin2002> END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Muehlenhoff out of all services on: 1733 hosts [production]
12:23 <jmm@cumin2002> START - Cookbook sre.idm.logout Logging Muehlenhoff out of all services on: 1733 hosts [production]
12:15 <mutante> mw1422 - scap pull [production]
12:09 <dzahn@cumin1001> conftool action : set/pooled=no; selector: name=mw1422.eqiad.wmnet [production]
12:02 <moritzm> upgrading python3-wmflib fleetwide to 0.0.8 (needed for new logout.d wrapper) [production]
12:01 <hnowlan@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on maps2008.codfw.wmnet with reason: Bootstrapping cassandra in new cluster [production]
12:01 <hnowlan@cumin1001> START - Cookbook sre.hosts.downtime for 3:00:00 on maps2008.codfw.wmnet with reason: Bootstrapping cassandra in new cluster [production]
11:52 <mutante> mw1422 - new setup, not in prod yet [production]
11:52 <jmm@cumin2002> END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Muehlenhoff out of all services on: 1733 hosts [production]
11:52 <dzahn@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on mw1422.eqiad.wmnet with reason: new host [production]
11:52 <dzahn@cumin1001> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on mw1422.eqiad.wmnet with reason: new host [production]