2021-01-05
21:12 <razzi@cumin1001> END (PASS) - Cookbook sre.aqs.roll-restart (exit_code=0) [production]
21:02 <razzi@cumin1001> START - Cookbook sre.aqs.roll-restart [production]
20:53 <razzi@deploy1001> Finished deploy [analytics/aqs/deploy@5d05f83]: Configure http request timeout and caching for T268809 (duration: 04m 48s) [production]
20:50 <jhuneidi@deploy1001> rebuilt and synchronized wikiversions files: group0 wikis to 1.36.0-wmf.25 refs T267418 [production]
20:48 <razzi@deploy1001> Started deploy [analytics/aqs/deploy@5d05f83]: Configure http request timeout and caching for T268809 [production]
20:45 <ottomata> Refine changes: event tables now have is_wmf_domain, canary events are removed, and corrupt records will result in a better monitoring email [analytics]
20:44 <razzi> deploy aqs (analytics query service) as part of analytics train [production]
20:43 <razzi> deploy aqs as part of train [analytics]
20:38 <rzl> rzl@mw1362:~$ sudo -i /usr/local/sbin/restart-php7.2-fpm [production]
20:28 <mutante> repooled mw1362 [production]
20:20 <mutante> mw1344 - /usr/local/sbin/restart-php7.2-fpm [production]
20:04 <mutante> mw1344 - restarted apache2 - it was showing the same "partial results" error as mw1362 - no other appservers are showing up in logstash, but these were the #1 and #2 sources of errors [production]
19:47 <mutante> depooled mw1362 [production]
19:41 <mutante> mw1362 - restarted apache2 [production]
19:29 <razzi@deploy1001> Finished deploy [analytics/refinery@56fb3ff] (thin): Regular analytics weekly train THIN [analytics/refinery@6ce68c950fc339dc3748cf50e6925cd1031287c4] (duration: 00m 08s) [production]
19:29 <razzi@deploy1001> Started deploy [analytics/refinery@56fb3ff] (thin): Regular analytics weekly train THIN [analytics/refinery@6ce68c950fc339dc3748cf50e6925cd1031287c4] [production]
19:28 <razzi@deploy1001> Finished deploy [analytics/refinery@56fb3ff]: Regular analytics weekly train [analytics/refinery@6ce68c950fc339dc3748cf50e6925cd1031287c4] (duration: 09m 37s) [production]
19:19 <razzi@deploy1001> Started deploy [analytics/refinery@56fb3ff]: Regular analytics weekly train [analytics/refinery@6ce68c950fc339dc3748cf50e6925cd1031287c4] [production]
19:17 <razzi> deploying refinery for weekly train [production]
19:17 <razzi> deploying refinery for weekly train [analytics]
19:16 <mutante> mwdebug1003 - editing apache2 defaults conf and dropping ServerAdmin address. restarting [production]
18:59 <jhuneidi@deploy1001> Finished scap: testwikis wikis to 1.36.0-wmf.25 refs T267418 (duration: 39m 07s) [production]
18:49 <bstorm> changing the limits on k8s etcd nodes again, so disabling puppet on them T267966 [tools]
18:22 <jhuneidi@deploy1001> Started scap: testwikis wikis to 1.36.0-wmf.25 refs T267418 [production]
18:21 <mbsantos@deploy1001> helmfile [eqiad] Ran 'sync' command on namespace 'mobileapps' for release 'production' . [production]
18:18 <mbsantos@deploy1001> helmfile [codfw] Ran 'sync' command on namespace 'mobileapps' for release 'production' . [production]
18:13 <elukey> run homer on cr1/cr2-eqiad to update the analytics-in4 filter (https://gerrit.wikimedia.org/r/c/operations/homer/public/+/654469) [production]
18:08 <mbsantos@deploy1001> helmfile [staging] Ran 'sync' command on namespace 'mobileapps' for release 'staging' . [production]
17:10 <longma> 1.36.0-wmf.25 was branched at 083fd09afcd204cfef177e11d7a5e4fd1217acfc for T267418 [production]
17:00 <XioNoX> capture packets on pfw3-eqiad:reth0.1134 - T263833 [production]
15:50 <jbond42> merging puppetlabs-lvm update [production]
15:41 <volans> upgraded wmflib to 0.0.6 on all hosts where it's installed - T257905 [production]
15:37 <jiji@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc2025.codfw.wmnet with reason: REIMAGE [production]
15:35 <jiji@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on mc2025.codfw.wmnet with reason: REIMAGE [production]
15:35 <jiji@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc1025.eqiad.wmnet with reason: REIMAGE [production]
15:33 <jiji@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on mc1025.eqiad.wmnet with reason: REIMAGE [production]
14:59 <otto@deploy1001> Synchronized wmf-config/InitialiseSettings.php: Remove overrides from wgEventLoggingSchemas (duration: 00m 57s) [production]
14:11 <dcaro> finished ssl tests for enc, cleaned up cloud-puppetmaster-03 (T268877) [cloudinfra]
13:40 <moritzm> installing python-apt security updates on buster/stretch [production]
13:29 <moritzm> installing xen security updates on buster [production]
13:07 <dcaro> adding custom nginx config for labspuppetbackend on cloud-puppetmaster-03 to test ssl (T268877) [cloudinfra]
13:01 <moritzm> installing lxml security updates for stretch [production]
12:48 <elukey> add PXE d-i rescue bootable image config for jessie/stretch/buster to tftp [production]
12:43 <jmm@cumin2001> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
12:41 <arturo> live-hacking cloudinfra-internal-puppetmaster02 with https://gerrit.wikimedia.org/r/c/operations/puppet/+/654415 (T260834) [cloudinfra]
12:31 <arturo> refresh acme-chief config for mx certs https://gerrit.wikimedia.org/r/plugins/gitiles/cloud/instance-puppet/+/949f1b4e81f3a1c6d4f4825292343f1ee17c48a1%5E%21/ (T260834) [cloudinfra]
12:29 <jmm@cumin2001> START - Cookbook sre.dns.netbox [production]
12:21 <arturo> resolve git merge conflicts and rebase cloudinfra-internal-puppetmaster-02 /var/lib/git/labs/private [cloudinfra]
12:13 <sukhe@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on malmok.wikimedia.org with reason: rebooting for kernel update [production]
12:13 <sukhe@cumin1001> START - Cookbook sre.hosts.downtime for 0:10:00 on malmok.wikimedia.org with reason: rebooting for kernel update [production]