2021-08-27 §
16:46 <hnowlan@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on maps1005.eqiad.wmnet with reason: Resyncing from master [production]
16:46 <hnowlan@cumin1001> START - Cookbook sre.hosts.downtime for 3 days, 0:00:00 on maps1005.eqiad.wmnet with reason: Resyncing from master [production]
14:50 <akosiaris> stop flink on staging cluster to verify some IOPS starvation issues [production]
14:46 <akosiaris@deploy1002> helmfile [staging-codfw] DONE helmfile.d/admin 'sync'. [production]
14:45 <akosiaris@deploy1002> helmfile [staging-codfw] START helmfile.d/admin 'sync'. [production]
14:44 <akosiaris@deploy1002> helmfile [staging-eqiad] DONE helmfile.d/admin 'sync'. [production]
14:44 <akosiaris@deploy1002> helmfile [staging-eqiad] START helmfile.d/admin 'sync'. [production]
14:44 <akosiaris@deploy1002> helmfile [staging-eqiad] DONE helmfile.d/admin 'sync'. [production]
14:44 <akosiaris@deploy1002> helmfile [staging-eqiad] START helmfile.d/admin 'sync'. [production]
14:39 <hnowlan@puppetmaster1001> conftool action : set/pooled=no; selector: name=maps1005.eqiad.wmnet [production]
14:38 <hnowlan@cumin1001> END (FAIL) - Cookbook sre.postgresql.postgres-init (exit_code=99) [production]
14:37 <hnowlan@cumin1001> START - Cookbook sre.postgresql.postgres-init [production]
14:30 <hnowlan@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on maps1005.eqiad.wmnet with reason: Resyncing from master [production]
14:30 <hnowlan@cumin1001> START - Cookbook sre.hosts.downtime for 4:00:00 on maps1005.eqiad.wmnet with reason: Resyncing from master [production]
13:48 <dzahn@deploy1002> helmfile [staging] Ran 'sync' command on namespace 'miscweb' for release 'main' . [production]
12:49 <mutante> rsynced /srv/org/wikimedia/racktables from miscweb1002 to miscweb2002 (T269746) [production]
12:04 <topranks> removing peering to Wave Division Holdings / AS11404 at Equinix Chicago cr2-eqord, AS no longer on exchange. [production]
10:56 <akosiaris> sudo cumin 'mw*' 'ip ro ls dev docker0 && sysctl net.ipv4.ip_forward=0' to clear up the docker remnants of the dragonfly evaluation. T286054 [production]
10:31 <godog> bounce logstash on logstash1007 [production]
10:22 <elukey> fallback codfw ores to rdb2007 after maintenance [production]
10:18 <jiji@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rdb2007.codfw.wmnet [production]
10:12 <jiji@cumin1001> START - Cookbook sre.hosts.reboot-single for host rdb2007.codfw.wmnet [production]
09:49 <elukey> restart ores uwsgi/celery workers to failover rdb2007 to rdb2008 (and ease the reboot of rdb2007) [production]
09:33 <topranks> Running homer against mr1-ulsfo to force OOB interface to 100Mb/full-duplex - T288343 [production]
09:25 <cmooney@cumin1001> END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet with reason: Update to expose int type from Netbox - cmooney@cumin1001 [production]
09:25 <cmooney@cumin1001> START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet with reason: Update to expose int type from Netbox - cmooney@cumin1001 [production]
09:23 <cmooney@deploy1002> Finished deploy [homer/deploy@8183056]: Homer update exposing interface type from Netbox - T288343 (duration: 01m 28s) [production]
09:21 <cmooney@deploy1002> Started deploy [homer/deploy@8183056]: Homer update exposing interface type from Netbox - T288343 [production]
08:05 <tstarling@deploy1002> Synchronized php-1.37.0-wmf.20/extensions/SecurePoll/cli/wm-scripts/sendMail.php: (no justification provided) (duration: 00m 56s) [production]
07:49 <jayme> stopped kube-apiserver on kubestagemaster2001 for testing [production]
07:49 <jayme> stopped kube-apiserver on kubestage2001 for testing [production]
07:00 <godog> bounce logstash on logstash1008 [production]
06:43 <mwdebug-deploy@deploy1002> helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . [production]
06:41 <tstarling@deploy1002> Synchronized php-1.37.0-wmf.20/extensions/SecurePoll/cli/wm-scripts/sendMail.php: (no justification provided) (duration: 00m 56s) [production]
06:41 <mwdebug-deploy@deploy1002> helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . [production]
00:46 <mwdebug-deploy@deploy1002> helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . [production]
00:44 <mwdebug-deploy@deploy1002> helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . [production]
00:44 <legoktm@deploy1002> Synchronized php-1.37.0-wmf.20/extensions/PageTriage/: Revert backbone.js and underscore.js updates (T289825) (duration: 01m 06s) [production]
2021-08-26 §
22:06 <legoktm> restarted mailman3-web on lists1001 (T289798) [production]
19:09 <mwdebug-deploy@deploy1002> helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . [production]
19:08 <mwdebug-deploy@deploy1002> helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . [production]
19:02 <dancy@deploy1002> rebuilt and synchronized wikiversions files: group2 wikis to 1.37.0-wmf.20 [production]
18:59 <cmjohnson@cumin1001> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
18:54 <cmjohnson@cumin1001> START - Cookbook sre.dns.netbox [production]
18:26 <mwdebug-deploy@deploy1002> helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . [production]
18:24 <mwdebug-deploy@deploy1002> helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . [production]
18:19 <urbanecm@deploy1002> Synchronized wmf-config/InitialiseSettings.php: 66717bc039f40336144dcc0dfd97ff5331b418e9: Install Extension Quiz on ja.wikibooks (T289383) (duration: 01m 05s) [production]
18:17 <mwdebug-deploy@deploy1002> helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . [production]
18:16 <sukhe@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on durum1001.eqiad.wmnet with reason: testing out durum [production]
18:16 <sukhe@cumin1001> START - Cookbook sre.hosts.downtime for 0:30:00 on durum1001.eqiad.wmnet with reason: testing out durum [production]