2022-07-13
17:48 <cmjohnson@cumin1001> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudnet1006.eqiad.wmnet with OS bullseye [production]
17:48 <cmjohnson@cumin1001> START - Cookbook sre.hosts.reimage for host cloudnet1006.eqiad.wmnet with OS bullseye [production]
16:17 <milimetric@deploy1002> Finished deploy [airflow-dags/analytics@e58e61d]: (no justification provided) (duration: 00m 10s) [production]
16:17 <milimetric@deploy1002> Started deploy [airflow-dags/analytics@e58e61d]: (no justification provided) [production]
15:59 <bking@cumin1001> END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host elastic2040.codfw.wmnet with OS bullseye [production]
15:58 <elukey@deploy1002> helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' . [production]
15:58 <elukey@deploy1002> helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'. [production]
15:58 <elukey@deploy1002> helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'. [production]
15:56 <bking@cumin1001> START - Cookbook sre.hosts.reimage for host elastic2040.codfw.wmnet with OS bullseye [production]
15:21 <mwdebug-deploy@deploy1002> helmfile [codfw] DONE helmfile.d/services/mwdebug: apply [production]
15:20 <mwdebug-deploy@deploy1002> helmfile [codfw] START helmfile.d/services/mwdebug: apply [production]
15:20 <mwdebug-deploy@deploy1002> helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply [production]
15:19 <mwdebug-deploy@deploy1002> helmfile [eqiad] START helmfile.d/services/mwdebug: apply [production]
15:12 <aqu@deploy1002> Finished deploy [airflow-dags/analytics@9edd1ab]: Deploy [airflow-dags/analytics@9edd1ab] (duration: 00m 10s) [production]
15:12 <aqu@deploy1002> Started deploy [airflow-dags/analytics@9edd1ab]: Deploy [airflow-dags/analytics@9edd1ab] [production]
15:10 <aqu@deploy1002> Finished deploy [airflow-dags/analytics_test@9edd1ab]: Deploy [airflow-dags/analytics_test@9edd1ab] (duration: 00m 08s) [production]
15:10 <aqu@deploy1002> Started deploy [airflow-dags/analytics_test@9edd1ab]: Deploy [airflow-dags/analytics_test@9edd1ab] [production]
14:52 <bking@cumin1001> END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host elastic2049.codfw.wmnet with OS bullseye [production]
14:38 <bking@cumin1001> START - Cookbook sre.hosts.reimage for host elastic2049.codfw.wmnet with OS bullseye [production]
14:34 <aqu@deploy1002> Finished deploy [airflow-dags/analytics_test@03c1a05]: Deploy [airflow-dags/analytics_test@03c1a05] (duration: 00m 12s) [production]
14:34 <aqu@deploy1002> Started deploy [airflow-dags/analytics_test@03c1a05]: Deploy [airflow-dags/analytics_test@03c1a05] [production]
14:18 <aqu> Deployed refinery using scap, then deployed onto hdfs [production]
14:11 <bking@cumin1001> END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host elastic2049.codfw.wmnet with OS bullseye [production]
14:08 <aqu@deploy1002> Finished deploy [analytics/refinery@bd39e67] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@bd39e67] (duration: 07m 42s) [production]
14:04 <bking@cumin1001> START - Cookbook sre.hosts.reimage for host elastic2049.codfw.wmnet with OS bullseye [production]
14:01 <aqu@deploy1002> Started deploy [analytics/refinery@bd39e67] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@bd39e67] [production]
14:00 <aqu@deploy1002> Finished deploy [analytics/refinery@bd39e67] (thin): Regular analytics weekly train THIN [analytics/refinery@bd39e67] (duration: 00m 07s) [production]
14:00 <aqu@deploy1002> Started deploy [analytics/refinery@bd39e67] (thin): Regular analytics weekly train THIN [analytics/refinery@bd39e67] [production]
13:47 <bking@cumin1001> END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host elastic2049.codfw.wmnet with OS bullseye [production]
13:44 <marostegui@cumin1001> dbctl commit (dc=all): 'Remove weight from x1 master', diff saved to https://phabricator.wikimedia.org/P31037 and previous config saved to /var/cache/conftool/dbconfig/20220713-134413-marostegui.json [production]
13:37 <bking@cumin1001> START - Cookbook sre.hosts.reimage for host elastic2049.codfw.wmnet with OS bullseye [production]
13:20 <Lucas_WMDE> UTC afternoon backport window done [production]
13:20 <bking@cumin1001> END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host elastic2049.codfw.wmnet [production]
13:18 <mwdebug-deploy@deploy1002> helmfile [codfw] DONE helmfile.d/services/mwdebug: apply [production]
13:17 <lucaswerkmeister-wmde@deploy1002> Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:790399|Configure wgLexemeLexicalCategoryItemIds on Wikidata (T307441)]] (duration: 02m 45s) [production]
13:17 <mwdebug-deploy@deploy1002> helmfile [codfw] START helmfile.d/services/mwdebug: apply [production]
13:17 <mwdebug-deploy@deploy1002> helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply [production]
13:16 <mwdebug-deploy@deploy1002> helmfile [eqiad] START helmfile.d/services/mwdebug: apply [production]
13:10 <mwdebug-deploy@deploy1002> helmfile [codfw] DONE helmfile.d/services/mwdebug: apply [production]
13:10 <mwdebug-deploy@deploy1002> helmfile [codfw] START helmfile.d/services/mwdebug: apply [production]
13:10 <mwdebug-deploy@deploy1002> helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply [production]
13:09 <mwdebug-deploy@deploy1002> helmfile [eqiad] START helmfile.d/services/mwdebug: apply [production]
13:08 <lucaswerkmeister-wmde@deploy1002> Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:813594|Configure $wgBabelCategoryNames on Test Wikidata (T312920)]] (duration: 02m 51s) [production]
13:05 <inflatador> bking@elastic2049 rebooting for read-only fs [production]
13:04 <bking@cumin1001> START - Cookbook sre.hosts.reboot-single for host elastic2049.codfw.wmnet [production]
12:49 <damilare> payments-wiki upgraded from 2f95d8b4 to 6a8aa302 [production]
12:12 <moritzm> draining ganeti2028 T311686 [production]
12:08 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on ganeti2018.codfw.wmnet with reason: Remove node for eventual reimage, T311686 [production]
12:08 <jmm@cumin2002> START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on ganeti2018.codfw.wmnet with reason: Remove node for eventual reimage, T311686 [production]
11:43 <marostegui@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 15 hosts with reason: codfw s8 sanitarium master switch [production]