2021-09-20
|
09:48 |
<hnowlan@cumin1001> |
START - Cookbook sre.postgresql.postgres-init |
[production] |
09:47 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Remove s10 from eqiad T167973', diff saved to https://phabricator.wikimedia.org/P17300 and previous config saved to /var/cache/conftool/dbconfig/20210920-094739-marostegui.json |
[production] |
09:10 |
<moritzm> |
installing openssl1.0 updates for stretch with backport for forthcoming Let's Encrypt issuance chain update (T283165) |
[production] |
08:35 |
<moritzm> |
updating clamav on ticket.wikimedia.org/otrs1001 to 0.103.3 |
[production] |
08:12 |
<elukey> |
remove old /reportcard (password protected, old files from 2012) httpd settings for stats.wikimedia.org |
[analytics] |
08:02 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . |
[production] |
07:58 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . |
[production] |
07:58 |
<oblivian@deploy1002> |
helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . |
[production] |
07:49 |
<moritzm> |
uploaded maps-deduped-tilelist 0.0.3~deb10u1 to buster-wikimedia/main T290982 |
[production] |
07:48 |
<moritzm> |
uploaded maps-deduped-tilelist 0.0.3~deb10u1 to buster-wikimedia/main |
[production] |
07:48 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . |
[production] |
07:43 |
<oblivian@deploy1002> |
helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . |
[production] |
07:43 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . |
[production] |
07:35 |
<marostegui> |
Stop db1168 and db2129 in sync T167973 |
[production] |
07:34 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . |
[production] |
07:34 |
<urbanecm@deploy1002> |
Synchronized wmf-config/throttle.php: af9d6e4e29e5f53ad8cf5aa2c235d54500c433bd: Revert "Add throttle rule for Czech wiki course" (duration: 00m 56s) |
[production] |
07:32 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Depool db1168 T167973', diff saved to https://phabricator.wikimedia.org/P17299 and previous config saved to /var/cache/conftool/dbconfig/20210920-073256-marostegui.json |
[production] |
07:32 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repool db1096:3316 T167973', diff saved to https://phabricator.wikimedia.org/P17298 and previous config saved to /var/cache/conftool/dbconfig/20210920-073206-marostegui.json |
[production] |
07:31 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Depool db1096:3316 T167973', diff saved to https://phabricator.wikimedia.org/P17297 and previous config saved to /var/cache/conftool/dbconfig/20210920-073141-marostegui.json |
[production] |
07:31 |
<moritzm> |
uploaded PHP 7.2.34-18+0~20210223.60+debian10~1.gbpb21322+wmf2 to apt.wikimedia.org (component/php7.2 for buster-wikimedia) T291052 |
[production] |
07:29 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . |
[production] |
07:28 |
<urbanecm@deploy1002> |
Synchronized wmf-config/InitialiseSettings.php: 8c1d665b5e83f6b1dd1cc4a9c367cb6881473bba: enwiki: Bump Growth features to 25% (mentorship limited to 20% of those users) (T290927) (duration: 00m 57s) |
[production] |
07:20 |
<urbanecm> |
Revert undeployed config patch (https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/721959); not even pulled to deployment, so assuming it never hit prod (T289771) |
[production] |
06:00 |
<marostegui> |
Upgrade db2071, db2072, db2094 |
[production] |
2021-09-19
|
16:47 |
<wm-bot> |
<lucaswerkmeister> changed restart cronjob to 9:30 and 21:30 to avoid congestion at midnight/noon |
[tools.notwikilambda] |
16:39 |
<wm-bot> |
<lucaswerkmeister> (and reenabled SyntaxHighlight_GeSHi) |
[tools.notwikilambda] |
16:39 |
<wm-bot> |
<lucaswerkmeister> rolled back SyntaxHighlight_GeSHi to commit 580ce3425f, because I2e82e5aa2a is incompatible with pygments-server (provides input as file instead of stdin) |
[tools.notwikilambda] |
15:48 |
<wm-bot> |
<lucaswerkmeister> temporarily disabled SyntaxHighlight extension due to issues with the Pygments server |
[tools.notwikilambda] |
15:34 |
<wm-bot> |
<lucaswerkmeister> improved automatic updates of function orchestrator and evaluator, should work without downtime now |
[tools.notwikilambda] |
13:49 |
<wm-bot> |
<lucaswerkmeister> deployed 3c1b6e0810 (readinessProbe → startupProbe to avoid bloating access log); deployed by adding readinessProbe: null to the patch file and patching the deployment with that |
[tools.lexeme-forms] |
10:47 |
<wm-bot> |
<lucaswerkmeister> configured $wgFooterIcons if notwikilambda-configure-wgFooterIcons URL parameter isn’t set to false, working around T291325 |
[tools.notwikilambda] |
2021-09-17
|
21:28 |
<legoktm@cumin1001> |
END (PASS) - Cookbook sre.dns.netbox (exit_code=0) |
[production] |
21:19 |
<legoktm@cumin1001> |
START - Cookbook sre.dns.netbox |
[production] |
19:05 |
<hashar> |
Building Docker images for [tox-buster] Install shellcheck and cascade [integration/config] - https://gerrit.wikimedia.org/r/721881 |
[releng] |
19:00 |
<hnowlan@cumin1001> |
END (PASS) - Cookbook sre.postgresql.postgres-init (exit_code=0) |
[production] |
18:08 |
<Krinkle> |
Re-recreating qemu-1002 as integration-agent-qemu-1003 (Debian 11 Bullseye, g3.cores8.ram24.disk20.ephemeral40.4xiops), ref T284774 |
[releng] |
18:07 |
<Krinkle> |
Re-recreating qemu-1002 as integration-agent-qemu-1003 (Debian 11 Bullseye, g3.cores8.ram24.disk20.ephemeral40.4xiops), ref T28477 |
[releng] |
17:02 |
<cmjohnson@cumin1001> |
END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on cloudcephosd1022.eqiad.wmnet with reason: REIMAGE |
[production] |
17:02 |
<hnowlan@cumin1001> |
START - Cookbook sre.postgresql.postgres-init |
[production] |
17:00 |
<cmjohnson@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcephosd1022.eqiad.wmnet with reason: REIMAGE |
[production] |
16:48 |
<hnowlan@cumin1001> |
END (PASS) - Cookbook sre.postgresql.postgres-init (exit_code=0) |
[production] |
16:27 |
<cmjohnson@cumin1001> |
END (PASS) - Cookbook sre.dns.netbox (exit_code=0) |
[production] |
16:25 |
<cmjohnson@cumin1001> |
START - Cookbook sre.dns.netbox |
[production] |
16:19 |
<dpifke> |
Enabled TLS on Jumbo Kafka instances in deployment-prep. |
[releng] |
16:11 |
<cmjohnson@cumin1001> |
END (PASS) - Cookbook sre.dns.netbox (exit_code=0) |
[production] |