2023-07-27
09:07 |
<fabfur@cumin1001> |
START - Cookbook sre.hosts.reboot-single for host lvs1019.eqiad.wmnet |
[production] |
08:42 |
<fabfur> |
begin restarting lvs1019 (T335835) |
[production] |
08:34 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.kafka.roll-restart-brokers (exit_code=0) for Kafka A:kafka-main-eqiad cluster: Roll restart of jvm daemons. |
[production] |
08:15 |
<jnuche@deploy1002> |
rebuilt and synchronized wikiversions files: group2 wikis to 1.41.0-wmf.19 refs T340247 |
[production] |
07:54 |
<oblivian@deploy1002> |
helmfile [eqiad] DONE helmfile.d/services/mw-misc: apply |
[production] |
07:54 |
<oblivian@deploy1002> |
helmfile [eqiad] START helmfile.d/services/mw-misc: apply |
[production] |
07:54 |
<oblivian@deploy1002> |
helmfile [codfw] DONE helmfile.d/services/mw-misc: apply |
[production] |
07:54 |
<oblivian@deploy1002> |
helmfile [codfw] START helmfile.d/services/mw-misc: apply |
[production] |
07:40 |
<XioNoX> |
reboot lsw1-a1-codfw (test device) |
[production] |
06:53 |
<elukey@cumin1001> |
START - Cookbook sre.kafka.roll-restart-brokers for Kafka A:kafka-main-eqiad cluster: Roll restart of jvm daemons. |
[production] |
06:39 |
<isaranto@deploy1002> |
helmfile [ml-serve-eqiad] 'sync' command on namespace 'ores-legacy' for release 'main' . |
[production] |
06:38 |
<isaranto@deploy1002> |
helmfile [ml-serve-codfw] 'sync' command on namespace 'ores-legacy' for release 'main' . |
[production] |
06:36 |
<isaranto@deploy1002> |
helmfile [ml-staging-codfw] 'sync' command on namespace 'ores-legacy' for release 'main' . |
[production] |
06:03 |
<oblivian@deploy1002> |
helmfile [eqiad] DONE helmfile.d/services/mw-misc: apply |
[production] |
05:57 |
<oblivian@deploy1002> |
helmfile [eqiad] START helmfile.d/services/mw-misc: apply |
[production] |
05:45 |
<oblivian@deploy1002> |
helmfile [codfw] DONE helmfile.d/services/mw-misc: apply |
[production] |
05:40 |
<oblivian@deploy1002> |
helmfile [codfw] START helmfile.d/services/mw-misc: apply |
[production] |
05:26 |
<oblivian@deploy1002> |
Started scap: (no justification provided) |
[production] |
05:26 |
<_joe_> |
scap is not syncing; just rebuilding the image from scratch to verify the reason for a bug. |
[production] |
05:22 |
<oblivian@deploy1002> |
Started scap: (no justification provided) |
[production] |
03:19 |
<cstone> |
payments-wiki upgraded from 2a68dfe2 to 1a6ca7ab |
[production] |
03:04 |
<eileen> |
civicrm upgraded from 5a84b138 to 16c2e58a |
[production] |
00:54 |
<eileen> |
civicrm upgraded from 68f29b70 to 5a84b138 |
[production] |
00:51 |
<eileen> |
civicrm upgraded from 853c14f3 to 68f29b70 |
[production] |
00:20 |
<eileen> |
rollback because I got an error when I tried to view - so let's see |
[production] |
00:20 |
<eileen> |
civicrm rolled back from 68f29b70 to 853c14f3 (locked) |
[production] |
00:17 |
<eileen> |
civicrm upgraded from 853c14f3 to 68f29b70 |
[production] |
2023-07-26
23:01 |
<jforrester@deploy1002> |
Synchronized wmf-config/interwiki.php: Update interwiki cache now that wikifunctions is here (duration: 06m 52s) |
[production] |
21:53 |
<bking@cumin1001> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wcqs2001.codfw.wmnet |
[production] |
21:46 |
<bking@cumin1001> |
START - Cookbook sre.hosts.reboot-single for host wcqs2001.codfw.wmnet |
[production] |
21:23 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db2180 (T342617)', diff saved to https://phabricator.wikimedia.org/P49745 and previous config saved to /var/cache/conftool/dbconfig/20230726-212310-ladsgroup.json |
[production] |
21:08 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db2180', diff saved to https://phabricator.wikimedia.org/P49744 and previous config saved to /var/cache/conftool/dbconfig/20230726-210804-ladsgroup.json |
[production] |
21:04 |
<jhancock@cumin2002> |
END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host rdb1013.eqiad.wmnet with OS bullseye |
[production] |
21:04 |
<jhancock@cumin2002> |
START - Cookbook sre.hosts.reimage for host rdb1013.eqiad.wmnet with OS bullseye |
[production] |
21:00 |
<taavi> |
manually attach User:WikiLambda_system to SUL T342811 |
[production] |
20:52 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db2180', diff saved to https://phabricator.wikimedia.org/P49743 and previous config saved to /var/cache/conftool/dbconfig/20230726-205257-ladsgroup.json |
[production] |
20:37 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db2180 (T342617)', diff saved to https://phabricator.wikimedia.org/P49742 and previous config saved to /var/cache/conftool/dbconfig/20230726-203751-ladsgroup.json |
[production] |
20:34 |
<taavi@deploy1002> |
Finished scap: Backport for [[gerrit:941954|clienthints: Start collecting client hints data on testwiki (T341110)]], [[gerrit:941021|CheckUser event table migration: Write new on group0 (T330158)]] (duration: 26m 17s) |
[production] |
20:15 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Depooling db2180 (T342617)', diff saved to https://phabricator.wikimedia.org/P49741 and previous config saved to /var/cache/conftool/dbconfig/20230726-201554-ladsgroup.json |
[production] |
20:15 |
<ladsgroup@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2180.codfw.wmnet with reason: Maintenance |
[production] |
20:15 |
<ladsgroup@cumin1001> |
START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2180.codfw.wmnet with reason: Maintenance |
[production] |
20:15 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db2171:3316 (T342617)', diff saved to https://phabricator.wikimedia.org/P49740 and previous config saved to /var/cache/conftool/dbconfig/20230726-201533-ladsgroup.json |
[production] |
20:09 |
<taavi@deploy1002> |
dreamyjazz and taavi: Backport for [[gerrit:941954|clienthints: Start collecting client hints data on testwiki (T341110)]], [[gerrit:941021|CheckUser event table migration: Write new on group0 (T330158)]] synced to the testservers mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, and mw-debug kubernetes deployment (accessible via k8s-experimental XWD |
[production] |
20:08 |
<taavi@deploy1002> |
Started scap: Backport for [[gerrit:941954|clienthints: Start collecting client hints data on testwiki (T341110)]], [[gerrit:941021|CheckUser event table migration: Write new on group0 (T330158)]] |
[production] |
20:00 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db2171:3316', diff saved to https://phabricator.wikimedia.org/P49739 and previous config saved to /var/cache/conftool/dbconfig/20230726-200026-ladsgroup.json |
[production] |
19:45 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db2171:3316', diff saved to https://phabricator.wikimedia.org/P49738 and previous config saved to /var/cache/conftool/dbconfig/20230726-194520-ladsgroup.json |
[production] |
19:30 |
<ladsgroup@cumin1001> |
dbctl commit (dc=all): 'Repooling after maintenance db2171:3316 (T342617)', diff saved to https://phabricator.wikimedia.org/P49737 and previous config saved to /var/cache/conftool/dbconfig/20230726-193014-ladsgroup.json |
[production] |
18:48 |
<jforrester@deploy1002> |
helmfile [staging] DONE helmfile.d/services/wikifunctions: apply |
[production] |
18:47 |
<jforrester@deploy1002> |
helmfile [staging] START helmfile.d/services/wikifunctions: apply |
[production] |
18:45 |
<jforrester@deploy1002> |
helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply |
[production] |