2022-08-05
00:53 <dzahn@cumin1001> END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for gerrit2002.wikimedia.org [production]
00:53 <dzahn@cumin1001> START - Cookbook sre.hosts.remove-downtime for gerrit2002.wikimedia.org [production]
00:52 <dzahn@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8 days, 0:00:00 on gerrit2002.wikimedia.org with reason: decom, replaced by gerrit2002 [production]
00:52 <dzahn@cumin1001> START - Cookbook sre.hosts.downtime for 8 days, 0:00:00 on gerrit2002.wikimedia.org with reason: decom, replaced by gerrit2002 [production]
00:18 <mutante> restarting gerrit for config change - removing old replica T313250 [production]
2022-08-04
23:06 <mutante> switching gerrit-replica.wikimedia.org to new machine gerrit2002, dropping gerrit-replica-new.wikimedia.org T313250 [production]
21:07 <ryankemper@deploy1002> helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply [production]
20:59 <mwdebug-deploy@deploy1002> helmfile [codfw] DONE helmfile.d/services/mwdebug: apply [production]
20:57 <mwdebug-deploy@deploy1002> helmfile [codfw] START helmfile.d/services/mwdebug: apply [production]
20:57 <mwdebug-deploy@deploy1002> helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply [production]
20:56 <mwdebug-deploy@deploy1002> helmfile [eqiad] START helmfile.d/services/mwdebug: apply [production]
20:56 <thcipriani@deploy1002> Finished scap: Backport for [[gerrit:819774]] tkwiki: Update wordmark (duration: 06m 12s) [production]
20:51 <ryankemper@deploy1002> helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply [production]
20:51 <ryankemper@deploy1002> helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply [production]
20:51 <ryankemper@deploy1002> helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply [production]
20:50 <thcipriani@deploy1002> Started scap: Backport for [[gerrit:819774]] tkwiki: Update wordmark [production]
20:48 <thcipriani@deploy1002> Finished scap: Backport for [[gerrit:812391]] [config]: Add click event logging for mobile and desktop (duration: 39m 16s) [production]
20:45 <ryankemper@deploy1002> helmfile [codfw] DONE helmfile.d/services/changeprop-jobqueue: apply [production]
20:24 <ryankemper@deploy1002> helmfile [codfw] START helmfile.d/services/changeprop-jobqueue: apply [production]
20:23 <ryankemper@deploy1002> helmfile [staging] DONE helmfile.d/services/changeprop-jobqueue: apply [production]
20:22 <ryankemper@deploy1002> helmfile [staging] START helmfile.d/services/changeprop-jobqueue: apply [production]
20:16 <mwdebug-deploy@deploy1002> helmfile [codfw] DONE helmfile.d/services/mwdebug: apply [production]
20:15 <mwdebug-deploy@deploy1002> helmfile [codfw] START helmfile.d/services/mwdebug: apply [production]
20:15 <mwdebug-deploy@deploy1002> helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply [production]
20:14 <mwdebug-deploy@deploy1002> helmfile [eqiad] START helmfile.d/services/mwdebug: apply [production]
20:13 <ryankemper@deploy1002> helmfile [staging] DONE helmfile.d/services/changeprop-jobqueue: apply [production]
20:13 <ryankemper@deploy1002> helmfile [staging] START helmfile.d/services/changeprop-jobqueue: apply [production]
20:10 <ryankemper@deploy1002> helmfile [staging] START helmfile.d/services/changeprop-jobqueue: apply [production]
20:09 <ryankemper@deploy1002> helmfile [staging] DONE helmfile.d/services/changeprop-jobqueue: apply [production]
20:08 <thcipriani@deploy1002> Started scap: Backport for [[gerrit:812391]] [config]: Add click event logging for mobile and desktop [production]
19:59 <ryankemper@deploy1002> helmfile [staging] START helmfile.d/services/changeprop-jobqueue: apply [production]
19:55 <dancy@deploy1002> rebuilt and synchronized wikiversions files: resync [production]
19:49 <mvernon@cumin1001> END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for thanos-be2001.codfw.wmnet [production]
19:49 <mvernon@cumin1001> START - Cookbook sre.hosts.remove-downtime for thanos-be2001.codfw.wmnet [production]
19:44 <mvernon@cumin1001> END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 8 hosts [production]
19:44 <mvernon@cumin1001> START - Cookbook sre.hosts.remove-downtime for 8 hosts [production]
19:42 <Emperor> rebooting thanos-be2001 to fix drive ordering [production]
19:37 <bking@cumin1001> END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for elastic2071.codfw.wmnet [production]
19:37 <bking@cumin1001> START - Cookbook sre.hosts.remove-downtime for elastic2071.codfw.wmnet [production]
19:31 <bking@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on elastic2071.codfw.wmnet with reason: T310146 [production]
19:31 <bking@cumin1001> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on elastic2071.codfw.wmnet with reason: T310146 [production]
19:13 <mwdebug-deploy@deploy1002> helmfile [codfw] DONE helmfile.d/services/mwdebug: apply [production]
19:12 <mwdebug-deploy@deploy1002> helmfile [codfw] START helmfile.d/services/mwdebug: apply [production]
19:12 <mwdebug-deploy@deploy1002> helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply [production]
19:12 <ryankemper@deploy1002> helmfile [eqiad] DONE helmfile.d/services/changeprop: apply [production]
19:11 <ryankemper@deploy1002> helmfile [eqiad] START helmfile.d/services/changeprop: apply [production]
19:11 <dancy> There were many errors during php-fpm restart due to failure to contact http://lvs2009:9090/pools/appservers-https_443/mw2361.codfw.wmnet and the like. [production]
19:11 <mwdebug-deploy@deploy1002> helmfile [eqiad] START helmfile.d/services/mwdebug: apply [production]
19:10 <dancy@deploy1002> rebuilt and synchronized wikiversions files: group2 wikis to 1.39.0-wmf.23 refs T308076 [production]
19:09 <ryankemper@deploy1002> helmfile [codfw] DONE helmfile.d/services/changeprop: apply [production]