2021-04-27
03:27 <ryankemper> [WDQS Deploy] Gearing up for deploy of wdqs `0.3.70`. Pre-deploy tests passing on canary `wdqs1003` [production]
03:17 <ryankemper> T280382 `wdqs1006` has been re-imaged and had the appropriate wikidata/categories journal files transferred. `df -h` shows disk space is no longer an issue following the switch to raid0: `/dev/md2 2.6T 998G 1.5T 40% /srv` [production]
02:56 <ryankemper@cumin1001> END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) [production]
01:29 <ryankemper> T280382 `sudo -i cookbook sre.wdqs.data-transfer --source wdqs1004.eqiad.wmnet --dest wdqs1006.eqiad.wmnet --reason "transferring fresh wikidata journal following reimage" --blazegraph_instance blazegraph --task-id T280382` on `ryankemper@cumin1001` tmux session `reimage` [production]
01:29 <ryankemper@cumin1001> START - Cookbook sre.wdqs.data-transfer [production]
01:27 <ryankemper@cumin1001> END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) [production]
01:21 <ryankemper> T280382 `sudo -i cookbook sre.wdqs.data-transfer --source wdqs1004.eqiad.wmnet --dest wdqs1006.eqiad.wmnet --reason "transferring fresh wikidata journal following reimage" --blazegraph_instance categories` on `ryankemper@cumin1001` tmux session `reimage` [production]
01:21 <ryankemper@cumin1001> START - Cookbook sre.wdqs.data-transfer [production]
2021-04-26
23:28 <mutante> renewing TLS cert for peopleweb.discovery.wmnet, adding *3 hosts [production]
23:21 <dzahn@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on people1003.eqiad.wmnet with reason: new host [production]
23:21 <dzahn@cumin1001> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on people1003.eqiad.wmnet with reason: new host [production]
22:26 <ryankemper@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1006.eqiad.wmnet with reason: REIMAGE [production]
22:24 <ryankemper@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1006.eqiad.wmnet with reason: REIMAGE [production]
22:11 <ryankemper> T280382 `sudo -i wmf-auto-reimage-host -p T280382 --new wdqs1006.eqiad.wmnet` on `ryankemper@cumin1001` tmux session `reimage` [production]
21:21 <dzahn@cumin1001> END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host people1003.eqiad.wmnet [production]
20:48 <twentyafterfour> restarting php-fpm on phab1001 to deploy phabricator hotfix d238db85b8d8072d99f31805aa4a8a7cf0c09941 [production]
20:35 <dzahn@cumin1001> START - Cookbook sre.ganeti.makevm for new host people1003.eqiad.wmnet [production]
20:26 <dzahn@cumin1001> END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts planet1003.eqiad.wmnet [production]
20:15 <dzahn@cumin1001> START - Cookbook sre.hosts.decommission for hosts planet1003.eqiad.wmnet [production]
19:45 <legoktm> uploaded python3-falcon, python3-mimeparse, python3-mujson, openstack-pkg-tools to mailman3 component on apt.wm.o [production]
18:51 <robh@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wcqs1003.eqiad.wmnet with reason: REIMAGE [production]
18:49 <robh@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wcqs1002.eqiad.wmnet with reason: REIMAGE [production]
18:49 <robh@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on wcqs1003.eqiad.wmnet with reason: REIMAGE [production]
18:47 <robh@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wcqs1001.eqiad.wmnet with reason: REIMAGE [production]
18:47 <robh@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on wcqs1002.eqiad.wmnet with reason: REIMAGE [production]
18:45 <robh@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on wcqs1001.eqiad.wmnet with reason: REIMAGE [production]
18:18 <urbanecm@deploy1002> Synchronized wmf-config/InitialiseSettings.php: 2d16f6251a67cf13cef02bbdcb3c9f5c1c505d16: elwiki: Update Growth experiments configuration (T280172) (duration: 00m 58s) [production]
18:06 <urbanecm@deploy1002> Synchronized multiversion/MWScript.php: 5ace4e1b806bcfc4ea059f9e9cae9aa94c0bdbd1: Fix error message if MWScript.php is run without arguments (duration: 00m 58s) [production]
17:28 <dduvall@deploy1002> helmfile [eqiad] Ran 'sync' command on namespace 'blubberoid' for release 'production' . [production]
17:26 <dduvall@deploy1002> helmfile [codfw] Ran 'sync' command on namespace 'blubberoid' for release 'production' . [production]
17:18 <dduvall@deploy1002> helmfile [staging] Ran 'sync' command on namespace 'blubberoid' for release 'staging' . [production]
17:06 <legoktm> imported postorius_1.3.4-2~bpo10+2 to apt.wm.o [production]
16:49 <mutante> gerrit - restarted apache (hard) to remove time out from gerrit:682502 [production]
16:40 <mutante> gerrit1001 - reload apache2 [production]
16:36 <jiji@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1025.eqiad.wmnet [production]
16:30 <jiji@cumin1001> START - Cookbook sre.hosts.reboot-single for host mc1025.eqiad.wmnet [production]
15:26 <jbond@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest1002.eqiad.wmnet with reason: REIMAGE [production]
15:24 <jbond@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on sretest1002.eqiad.wmnet with reason: REIMAGE [production]
15:21 <elukey> restart zookeeper on conf2004 to pick up the -javaagent setting for the prometheus exporter [production]
15:06 <moritzm> installing jquery security updates on stretch [production]
15:01 <hnowlan@deploy1002> helmfile [eqiad] Ran 'sync' command on namespace 'api-gateway' for release 'production' . [production]
15:01 <hnowlan@deploy1002> helmfile [eqiad] Ran 'sync' command on namespace 'api-gateway' for release 'staging' . [production]
14:54 <hnowlan@deploy1002> helmfile [codfw] Ran 'sync' command on namespace 'api-gateway' for release 'production' . [production]
14:54 <hnowlan@deploy1002> helmfile [codfw] Ran 'sync' command on namespace 'api-gateway' for release 'staging' . [production]
14:48 <hnowlan@deploy1002> helmfile [staging] Ran 'sync' command on namespace 'api-gateway' for release 'staging' . [production]
14:47 <hnowlan@deploy1002> helmfile [staging] Ran 'sync' command on namespace 'api-gateway' for release 'production' . [production]
14:28 <moritzm> installing ldap-replica1003/1004 [production]
14:03 <jayme@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on conf2001.codfw.wmnet with reason: for zookeeper migration [production]
14:03 <jayme@cumin1001> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on conf2001.codfw.wmnet with reason: for zookeeper migration [production]
13:39 <marostegui@cumin1001> dbctl commit (dc=all): 'db1105:3311 (re)pooling @ 100%: Repool db1105:3311', diff saved to https://phabricator.wikimedia.org/P15537 and previous config saved to /var/cache/conftool/dbconfig/20210426-133922-root.json [production]