2022-09-15 §
17:39 <mwdebug-deploy@deploy1002> helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply [production]
17:39 <mwdebug-deploy@deploy1002> helmfile [eqiad] START helmfile.d/services/mwdebug: apply [production]
16:17 <andrew@cumin1001> END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging BryanDavis out of all services on: 2047 hosts [production]
16:16 <andrew@cumin1001> START - Cookbook sre.idm.logout Logging BryanDavis out of all services on: 2047 hosts [production]
15:39 <cwhite@cumin2002> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
15:37 <cwhite@cumin2002> START - Cookbook sre.dns.netbox [production]
15:28 <hnowlan@puppetmaster1001> conftool action : set/pooled=true; selector: dnsdisc=sessionstore,name=eqiad [production]
15:27 <hnowlan@deploy1002> helmfile [eqiad] DONE helmfile.d/services/sessionstore: sync [production]
15:27 <hnowlan@deploy1002> helmfile [eqiad] START helmfile.d/services/sessionstore: sync [production]
15:22 <hnowlan> starting cassandra on sessionstore1001-a [production]
15:18 <hnowlan@puppetmaster1001> conftool action : set/pooled=false; selector: dnsdisc=sessionstore,name=eqiad [production]
15:11 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1190 (T314041)', diff saved to https://phabricator.wikimedia.org/P34792 and previous config saved to /var/cache/conftool/dbconfig/20220915-151131-ladsgroup.json [production]
14:56 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P34791 and previous config saved to /var/cache/conftool/dbconfig/20220915-145625-ladsgroup.json [production]
14:41 <moritzm> installing libtirpc security updates [production]
14:41 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P34790 and previous config saved to /var/cache/conftool/dbconfig/20220915-144118-ladsgroup.json [production]
14:26 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1190 (T314041)', diff saved to https://phabricator.wikimedia.org/P34789 and previous config saved to /var/cache/conftool/dbconfig/20220915-142612-ladsgroup.json [production]
14:01 <sukhe> restarting bird.service on A:dns-auth for zlib update [production]
14:00 <urbanecm@deploy1002> Synchronized wmf-config/InitialiseSettings.php: 6b9784a0708cf1e7762034ccfba7e5604b2f6dc2: Enable the Vue version of the mentee overview in pilot wikis (T300532) (duration: 03m 45s) [production]
13:58 <aqu@deploy1002> Finished deploy [airflow-dags/analytics@b9be20d]: Regular analytics weekly train [airflow-dags@b9be20d] (duration: 00m 09s) [production]
13:58 <aqu@deploy1002> Started deploy [airflow-dags/analytics@b9be20d]: Regular analytics weekly train [airflow-dags@b9be20d] [production]
13:57 <sukhe> restarting haproxy.service on A:dns-auth for zlib update [production]
13:57 <aqu@deploy1002> Finished deploy [airflow-dags/analytics_test@b9be20d]: Regular analytics weekly train TEST [airflow-dags@b9be20d] (duration: 00m 10s) [production]
13:56 <aqu@deploy1002> Started deploy [airflow-dags/analytics_test@b9be20d]: Regular analytics weekly train TEST [airflow-dags@b9be20d] [production]
13:50 <jayme> updated rsyslog to 8.2208.0-1~bpo11+1 on all kubernetes masters and nodes - T289766 [production]
13:47 <mwdebug-deploy@deploy1002> helmfile [codfw] DONE helmfile.d/services/mwdebug: apply [production]
13:47 <aqu@deploy1002> Finished deploy [analytics/refinery@278c383] (hadoop-test): Regular analytics weekly train TEST (second try after freeing up some disk space) [analytics/refinery@278c383] (duration: 06m 01s) [production]
13:43 <mwdebug-deploy@deploy1002> helmfile [codfw] START helmfile.d/services/mwdebug: apply [production]
13:43 <mwdebug-deploy@deploy1002> helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply [production]
13:42 <mwdebug-deploy@deploy1002> helmfile [eqiad] START helmfile.d/services/mwdebug: apply [production]
13:41 <aqu@deploy1002> Started deploy [analytics/refinery@278c383] (hadoop-test): Regular analytics weekly train TEST (second try after freeing up some disk space) [analytics/refinery@278c383] [production]
13:38 <sukhe> restarting bird.service on A:dns-rec for zlib update [production]
13:37 <mwdebug-deploy@deploy1002> helmfile [codfw] DONE helmfile.d/services/mwdebug: apply [production]
13:36 <mwdebug-deploy@deploy1002> helmfile [codfw] START helmfile.d/services/mwdebug: apply [production]
13:36 <mwdebug-deploy@deploy1002> helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply [production]
13:35 <mwdebug-deploy@deploy1002> helmfile [eqiad] START helmfile.d/services/mwdebug: apply [production]
13:33 <sukhe> restarting pdns-recursor on A:dns-rec for zlib update [production]
13:33 <urbanecm@deploy1002> Synchronized php-1.39.0-wmf.28/extensions/GrowthExperiments/: f592e85858d17a2de99cde93627054ee4972c2bd: Mentee overview: avoid requiring the non-vue mentee overview script when loading the Vue one (T300532) (duration: 04m 05s) [production]
12:50 <elukey@deploy1002> helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'. [production]
12:50 <elukey@deploy1002> helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'. [production]
12:46 <hnowlan@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on sessionstore1001.eqiad.wmnet with reason: temporarily disabled due to sessionstore issues [production]
12:46 <hnowlan@cumin1001> START - Cookbook sre.hosts.downtime for 1:00:00 on sessionstore1001.eqiad.wmnet with reason: temporarily disabled due to sessionstore issues [production]
12:25 <hnowlan@cumin1001> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sessionstore1001.eqiad.wmnet with OS buster [production]
12:17 <jayme> fleet wide update of prometheus-rsyslog-exporter to 0.0.0+git20201008-4 - T289766 [production]
12:10 <hnowlan@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sessionstore1001.eqiad.wmnet with reason: host reimage [production]
12:06 <hnowlan@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on sessionstore1001.eqiad.wmnet with reason: host reimage [production]
12:00 <marostegui@cumin1001> dbctl commit (dc=all): 'db2131 (re)pooling @ 100%: Repooling for warm up after upgrade', diff saved to https://phabricator.wikimedia.org/P34787 and previous config saved to /var/cache/conftool/dbconfig/20220915-120013-root.json [production]
11:51 <hnowlan@cumin1001> START - Cookbook sre.hosts.reimage for host sessionstore1001.eqiad.wmnet with OS buster [production]
11:50 <hnowlan@cumin1001> END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sessionstore1001.eqiad.wmnet with OS buster [production]
11:45 <hnowlan@cumin1001> START - Cookbook sre.hosts.reimage for host sessionstore1001.eqiad.wmnet with OS buster [production]
11:45 <marostegui@cumin1001> dbctl commit (dc=all): 'db2131 (re)pooling @ 75%: Repooling for warm up after upgrade', diff saved to https://phabricator.wikimedia.org/P34786 and previous config saved to /var/cache/conftool/dbconfig/20220915-114508-root.json [production]