2021-07-27
14:47 <elukey@cumin1001> START - Cookbook sre.hosts.reboot-single for host ml-serve2001.codfw.wmnet [production]
14:47 <mmandere@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on cp[1079-1082].eqiad.wmnet with reason: Eqiad row B maintenance [production]
14:46 <mmandere@cumin1001> START - Cookbook sre.hosts.downtime for 1:00:00 on cp[1079-1082].eqiad.wmnet with reason: Eqiad row B maintenance [production]
14:45 <jmm@cumin2002> START - Cookbook sre.hosts.reboot-single for host sretest1002.eqiad.wmnet [production]
14:43 <marostegui@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1129.eqiad.wmnet with reason: REIMAGE [production]
14:41 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sretest1002.eqiad.wmnet [production]
14:40 <marostegui@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on db1129.eqiad.wmnet with reason: REIMAGE [production]
14:40 <mmandere> depool authdns1001 - T286061 [production]
14:40 <elukey@cumin1001> END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host ml-serve-ctrl2002.codfw.wmnet [production]
14:36 <jmm@cumin2002> START - Cookbook sre.hosts.reboot-single for host sretest1002.eqiad.wmnet [production]
14:34 <elukey@cumin1001> START - Cookbook sre.hosts.reboot-single for host ml-serve-ctrl2002.codfw.wmnet [production]
14:33 <mmandere> depool cp10[79-82].eqiad.wmnet - T286061 [production]
14:33 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve-ctrl2001.codfw.wmnet [production]
14:30 <topranks> Add peering to AS398196 - Cobalt Ridge at DE-CIX Dallas on cr2-codfw. [production]
14:29 <elukey> reduce vcores for ml-serve-ctrl[12]00[12] after performance testing - T287238 [production]
14:28 <elukey@cumin1001> START - Cookbook sre.hosts.reboot-single for host ml-serve-ctrl2001.codfw.wmnet [production]
14:25 <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db1129 T287230', diff saved to https://phabricator.wikimedia.org/P16916 and previous config saved to /var/cache/conftool/dbconfig/20210727-142520-marostegui.json [production]
14:19 <otto@deploy1002> helmfile [codfw] Ran 'sync' command on namespace 'eventgate-main' for release 'production' . [production]
14:16 <otto@deploy1002> helmfile [codfw] Ran 'sync' command on namespace 'eventgate-main' for release 'canary' . [production]
14:13 <otto@deploy1002> helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-main' for release 'production' . [production]
14:13 <otto@deploy1002> helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-main' for release 'canary' . [production]
14:11 <moritzm> installing aspell security updates [production]
14:11 <otto@deploy1002> helmfile [staging] Ran 'sync' command on namespace 'eventgate-main' for release 'production' . [production]
14:07 <otto@deploy1002> helmfile [codfw] Ran 'sync' command on namespace 'eventgate-analytics-external' for release 'production' . [production]
14:07 <otto@deploy1002> helmfile [codfw] Ran 'sync' command on namespace 'eventgate-analytics-external' for release 'canary' . [production]
14:03 <otto@deploy1002> helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-analytics-external' for release 'production' . [production]
14:03 <otto@deploy1002> helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-analytics-external' for release 'canary' . [production]
14:00 <otto@deploy1002> helmfile [staging] Ran 'sync' command on namespace 'eventgate-analytics-external' for release 'production' . [production]
13:59 <otto@deploy1002> helmfile [codfw] Ran 'sync' command on namespace 'eventgate-logging-external' for release 'production' . [production]
13:59 <otto@deploy1002> helmfile [codfw] Ran 'sync' command on namespace 'eventgate-logging-external' for release 'canary' . [production]
13:54 <otto@deploy1002> helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-logging-external' for release 'production' . [production]
13:54 <otto@deploy1002> helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-logging-external' for release 'canary' . [production]
13:52 <otto@deploy1002> helmfile [staging] Ran 'sync' command on namespace 'eventgate-logging-external' for release 'production' . [production]
13:42 <volans@deploy1002> Finished deploy [netbox/deploy@660ad14]: Deploy v2.10.4-wmf5 (duration: 02m 29s) [production]
13:40 <otto@deploy1002> helmfile [codfw] Ran 'sync' command on namespace 'eventgate-analytics' for release 'production' . [production]
13:39 <volans@deploy1002> Started deploy [netbox/deploy@660ad14]: Deploy v2.10.4-wmf5 [production]
13:36 <otto@deploy1002> helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-analytics' for release 'production' . [production]
13:34 <otto@deploy1002> helmfile [codfw] Ran 'sync' command on namespace 'eventgate-analytics' for release 'canary' . [production]
13:30 <ottomata> deploying eventgate-analytics with native prometheus support. Doing this slowly on canary release first to ensure https://wikitech.wikimedia.org/wiki/Incident_documentation/2021-07-14_eventgate-analytics_latency_spike_caused_MW_app_server_overload is fixed. [production]
13:29 <otto@deploy1002> helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-analytics' for release 'canary' . [production]
12:56 <elukey> created component/iptables185 for buster-wikimedia + imported packages from buster-backports [production]
12:50 <dcausse@deploy1002> Finished deploy [wikimedia/discovery/analytics@346ac10]: (no justification provided) (duration: 06m 13s) [production]
12:43 <dcausse@deploy1002> Started deploy [wikimedia/discovery/analytics@346ac10]: (no justification provided) [production]
11:23 <Lucas_WMDE> EU backport+config window done [production]
11:20 <oblivian@deploy1002> Synchronized debug.json: Config: [[gerrit:708255|Add the experimental kubernetes backend to mwdebug (T283056)]] (duration: 00m 56s) [production]
11:10 <lucaswerkmeister-wmde@deploy1002> Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:704456|Add stream configuration for ContentTranslation events (T281982)]] (duration: 00m 58s) [production]
10:39 <dzahn@cumin1001> END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mw1269.eqiad.wmnet [production]
10:16 <jelto> gitlab-ansible playbook on gitlab2001.wikimedia.org END (PASS) [production]
10:11 <mutante> replacing scap proxies: mw1269 with mw1420, mw1285 with mw1306 [production]
10:10 <marostegui@cumin1001> dbctl commit (dc=all): 'db2147 (re)pooling @ 100%: After mariadb restart and upgrade', diff saved to https://phabricator.wikimedia.org/P16909 and previous config saved to /var/cache/conftool/dbconfig/20210727-101053-root.json [production]