2021-05-05
|
06:40 |
<marostegui> |
Check tables on db1112 (lag might show up on s3 on wiki replicas) T280492 |
[production] |
06:39 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1178 (re)pooling @ 3%: Slowly pool db1178 into s8 T275633', diff saved to https://phabricator.wikimedia.org/P15729 and previous config saved to /var/cache/conftool/dbconfig/20210505-063920-root.json |
[production] |
06:24 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1178 (re)pooling @ 2%: Slowly pool db1178 into s8 T275633', diff saved to https://phabricator.wikimedia.org/P15728 and previous config saved to /var/cache/conftool/dbconfig/20210505-062416-root.json |
[production] |
06:09 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1178 (re)pooling @ 1%: Slowly pool db1178 into s8 T275633', diff saved to https://phabricator.wikimedia.org/P15727 and previous config saved to /var/cache/conftool/dbconfig/20210505-060912-root.json |
[production] |
06:08 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Add db1178 into dbctl T275633', diff saved to https://phabricator.wikimedia.org/P15726 and previous config saved to /var/cache/conftool/dbconfig/20210505-060814-marostegui.json |
[production] |
06:06 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Remove db1104 from API', diff saved to https://phabricator.wikimedia.org/P15725 and previous config saved to /var/cache/conftool/dbconfig/20210505-060636-marostegui.json |
[production] |
06:00 |
<marostegui> |
Restart mysqld on x1 database primary master (db1103) T281212 |
[production] |
05:38 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Slowly repool db1099:3311 into main traffic', diff saved to https://phabricator.wikimedia.org/P15724 and previous config saved to /var/cache/conftool/dbconfig/20210505-053841-marostegui.json |
[production] |
05:32 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repool db1106 into s1 vslow, remove db1099:3311', diff saved to https://phabricator.wikimedia.org/P15723 and previous config saved to /var/cache/conftool/dbconfig/20210505-053211-marostegui.json |
[production] |
05:29 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Depool db1096:3316 for schema change', diff saved to https://phabricator.wikimedia.org/P15722 and previous config saved to /var/cache/conftool/dbconfig/20210505-052943-marostegui.json |
[production] |
04:53 |
<eileen> |
civicrm revision changed from e7c610fd87 to 8034e47008, config revision is 189788d452 |
[production] |
03:58 |
<ryankemper> |
T280563 `sudo -i cookbook sre.elasticsearch.rolling-operation search_codfw "codfw reboot" --reboot --nodes-per-run 3 --start-datetime 2021-04-29T23:04:29 --task-id T280563` on `ryankemper@cumin1001` tmux session `elastic_restarts` |
[production] |
03:58 |
<ryankemper@cumin1001> |
START - Cookbook sre.elasticsearch.rolling-operation reboot without plugin upgrade (3 nodes at a time) for ElasticSearch cluster search_codfw: codfw reboot - ryankemper@cumin1001 - T280563 |
[production] |
03:56 |
<ryankemper> |
T280563 Reboot of `eqiad` complete. Only ~half of `codfw` is remaining. |
[production] |
03:56 |
<ryankemper@cumin1001> |
END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) reboot without plugin upgrade (3 nodes at a time) for ElasticSearch cluster search_eqiad: eqiad reboot to apply sec updates - ryankemper@cumin1001 - T280563 |
[production] |
03:54 |
<ryankemper> |
T280382 `wdqs1011.eqiad.wmnet` has been re-imaged and had the appropriate wikidata/categories journal files transferred. `df -h` shows disk space is no longer an issue following the switch to `raid0`: `/dev/mapper/vg0-srv 2.7T 998G 1.6T 39% /srv` |
[production] |
03:52 |
<ryankemper@cumin1001> |
START - Cookbook sre.elasticsearch.rolling-operation reboot without plugin upgrade (3 nodes at a time) for ElasticSearch cluster search_eqiad: eqiad reboot to apply sec updates - ryankemper@cumin1001 - T280563 |
[production] |
03:51 |
<ryankemper> |
T280382 [WDQS] `ryankemper@wdqs2007:~$ sudo depool` (need to monitor host to see if it becomes ssh unreachable again or if it was a one-off; also high update lag) |
[production] |
03:50 |
<ryankemper> |
T280382 `wdqs2007.codfw.wmnet` has been re-imaged and had the appropriate wikidata/categories journal files transferred. `df -h` shows disk space is no longer an issue following the switch to `raid0`: `/dev/mapper/vg0-srv 2.7T 998G 1.6T 39% /srv` |
[production] |
03:07 |
<ryankemper@cumin1001> |
END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) |
[production] |
03:02 |
<ryankemper@cumin1001> |
END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) |
[production] |
02:59 |
<ryankemper@cumin1001> |
END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) reboot without plugin upgrade (3 nodes at a time) for ElasticSearch cluster search_eqiad: eqiad reboot to apply sec updates - ryankemper@cumin1001 - T280563 |
[production] |
01:55 |
<ryankemper> |
T281327 [Elastic] Unbanned `elastic2043` from cluster |
[production] |
01:50 |
<ryankemper@cumin1001> |
START - Cookbook sre.wdqs.data-transfer |
[production] |
01:49 |
<ryankemper> |
T280382 `sudo -i cookbook sre.wdqs.data-transfer --source wdqs2001.codfw.wmnet --dest wdqs2007.codfw.wmnet --reason "transferring fresh wikidata journal following reimage" --blazegraph_instance blazegraph` on `ryankemper@cumin1001` tmux session `reimage` (will likely fail due to underlying hw but we'll see) |
[production] |
01:47 |
<ryankemper@cumin1001> |
END (ERROR) - Cookbook sre.wdqs.data-transfer (exit_code=97) |
[production] |
01:45 |
<ryankemper> |
T280382 `sudo -i cookbook sre.wdqs.data-transfer --source wdqs1006.eqiad.wmnet --dest wdqs1011.eqiad.wmnet --reason "transferring fresh wikidata journal following reimage" --blazegraph_instance blazegraph` on `ryankemper@cumin1001` tmux session `reimage` |
[production] |
01:45 |
<ryankemper@cumin1001> |
START - Cookbook sre.wdqs.data-transfer |
[production] |
01:44 |
<ryankemper@cumin1001> |
END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) |
[production] |
01:43 |
<ryankemper> |
T280382 [WDQS] `racadm>>racadm serveraction powercycle` on `wdqs2007` |
[production] |
01:39 |
<ryankemper> |
T280382 `sudo -i cookbook sre.wdqs.data-transfer --source wdqs1006.eqiad.wmnet --dest wdqs1011.eqiad.wmnet --reason "transferring fresh categories journal following reimage" --blazegraph_instance categories` on `ryankemper@cumin1001` tmux session `reimage` |
[production] |
01:39 |
<ryankemper@cumin1001> |
START - Cookbook sre.wdqs.data-transfer |
[production] |
01:36 |
<ryankemper@cumin1001> |
START - Cookbook sre.elasticsearch.rolling-operation reboot without plugin upgrade (3 nodes at a time) for ElasticSearch cluster search_eqiad: eqiad reboot to apply sec updates - ryankemper@cumin1001 - T280563 |
[production] |
00:29 |
<eileen> |
civicrm revision changed from 94e321dbe0 to e7c610fd87, config revision is 189788d452 |
[production] |
00:15 |
<ejegg> |
updated payments-wiki from 44570561f2 to d449599540 |
[production] |
00:08 |
<urbanecm@deploy1002> |
Synchronized wmf-config/InitialiseSettings.php: 3f6ea8c0e5a4dc667969f5847207902727625bbe: Growth: enwiki: Add list of mentors (T281896) (duration: 01m 10s) |
[production] |
00:00 |
<urbanecm@deploy1002> |
Synchronized fc-list: 93970496da7678d896b7f812b3bb5f4cf0b691ad: update fc-list to current version on buster (T79424) (duration: 01m 09s) |
[production] |
2021-05-04
|
23:41 |
<urbanecm@deploy1002> |
Synchronized wmf-config/config/enwiki.yaml: d29dbb2f435afe64f2fee15b430ee04d5d13c8d7: Enable Growth features on enwiki in the dark mode (T281896; 3/3) (duration: 01m 09s) |
[production] |
23:40 |
<urbanecm@deploy1002> |
Synchronized dblists/growthexperiments.dblist: d29dbb2f435afe64f2fee15b430ee04d5d13c8d7: Enable Growth features on enwiki in the dark mode (T281896; 2/3) (duration: 01m 09s) |
[production] |
23:38 |
<urbanecm@deploy1002> |
Synchronized wmf-config/InitialiseSettings.php: d29dbb2f435afe64f2fee15b430ee04d5d13c8d7: Enable Growth features on enwiki in the dark mode (T281896; 1/3) (duration: 01m 09s) |
[production] |
23:31 |
<urbanecm@deploy1002> |
Synchronized wmf-config/config/bgwiki.yaml: 5b4c516a1d0461065e27cacec5d2b1cb315a2c07: Enable Growth team features in dark mode on bgwiki (T280824; 3/3) (duration: 01m 09s) |
[production] |
23:30 |
<urbanecm@deploy1002> |
sync-file aborted: 5b4c516a1d0461065e27cacec5d2b1cb315a2c07: Enable Growth team features in dark mode on bgwiki (T280824; 3/3) (duration: 00m 03s) |
[production] |
23:30 |
<urbanecm@deploy1002> |
Synchronized dblists/growthexperiments.dblist: 5b4c516a1d0461065e27cacec5d2b1cb315a2c07: Enable Growth team features in dark mode on bgwiki (T280824; 2/3) (duration: 01m 09s) |
[production] |
23:28 |
<urbanecm@deploy1002> |
Synchronized wmf-config/InitialiseSettings.php: 5b4c516a1d0461065e27cacec5d2b1cb315a2c07: Enable Growth team features in dark mode on bgwiki (T280824; 1/3) (duration: 01m 09s) |
[production] |
23:26 |
<Urbanecm> |
Create tables for GrowthExperiments extension on enwiki (T281896) |
[production] |
23:24 |
<Urbanecm> |
Create tables for GrowthExperiments extension on bgwiki (T280824) |
[production] |
23:22 |
<urbanecm@deploy1002> |
Synchronized wmf-config/CommonSettings.php: a3c24f322b754c9a94c260ee5df4b5ae4de27f22: Avoid using User::getGroups() and ::getEffectiveGroups() (T281823) (duration: 01m 10s) |
[production] |
23:13 |
<urbanecm@deploy1002> |
Synchronized wmf-config/InitialiseSettings.php: e467d92e5e257a3d2f9b05692db9accdd86ddb00: Add extendedconfirmed on ptwiki (T281926) (duration: 01m 10s) |
[production] |
23:06 |
<urbanecm@deploy1002> |
Synchronized wmf-config/InitialiseSettings.php: 012d6138741ea76c985453428111aeddfdec2271: Add extendedconfirmed on azwiki (T281860) (duration: 01m 10s) |
[production] |
22:49 |
<bblack@cumin1001> |
END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on cp5016.eqsin.wmnet with reason: REIMAGE |
[production] |