2023-04-04
13:34 <sukhe> run authdns-update for CR 905612, reverting depool of eqiad [production]
13:30 <hnowlan@puppetmaster1001> conftool action : set/pooled=yes; selector: name=thumbor1006.eqiad.wmnet [production]
13:25 <cgoubert@deploy2002> helmfile [eqiad] DONE helmfile.d/services/mw-web: apply [production]
13:25 <cgoubert@deploy2002> helmfile [eqiad] START helmfile.d/services/mw-web: apply [production]
13:13 <hnowlan@puppetmaster1001> conftool action : set/pooled=no; selector: name=thumbor1006.eqiad.wmnet [production]
13:11 <hnowlan@puppetmaster1001> conftool action : set/pooled=no; selector: name=maps1009.eqiad.wmnet [production]
13:11 <XioNoX> asw2-c-eqiad> request system reboot all-members - T331882 [production]
13:10 <urbanecm@deploy2002> Finished scap: Backport for [[gerrit:905544|ckbwiktionary: Add logo (T331831)]] (duration: 07m 00s) [production]
13:05 <akosiaris@cumin1001> END (PASS) - Cookbook sre.discovery.datacenter (exit_code=0) depool all active/active services in eqiad: eqiad row C switches upgrade - T331882 [production]
13:03 <urbanecm@deploy2002> Started scap: Backport for [[gerrit:905544|ckbwiktionary: Add logo (T331831)]] [production]
13:02 <ayounsi@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 227 hosts with reason: eqiad row C upgrade [production]
12:57 <ayounsi@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on 227 hosts with reason: eqiad row C upgrade [production]
12:57 <steve_munene> putting pdfs into safe mode as part of T331882 [production]
12:52 <ayounsi@cumin1001> END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on 228 hosts with reason: eqiad row C upgrade [production]
12:52 <ayounsi@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on 228 hosts with reason: eqiad row C upgrade [production]
12:44 <akosiaris@cumin1001> START - Cookbook sre.discovery.datacenter depool all active/active services in eqiad: eqiad row C switches upgrade - T331882 [production]
12:43 <Emperor> depool thanos-fe1003 re T331882 [production]
12:38 <Emperor> depool ms-fe1011 re T331882 [production]
12:32 <sukhe> [finished] run authdns-update for CR: 905603 depool eqiad [production]
12:31 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on 38 hosts with reason: Row c switch maint T331882 [production]
12:31 <sukhe> run authdns-update for CR: 905603 depool eqiad [production]
12:31 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 6:00:00 on 38 hosts with reason: Row c switch maint T331882 [production]
12:28 <stevemunene@puppetmaster1001> conftool action : set/pooled=no; selector: name=aqs1018.eqiad.wmnet [production]
12:28 <volans@cumin1001> END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling update on A:netbox [production]
12:28 <stevemunene@puppetmaster1001> conftool action : set/pooled=no; selector: name=aqs1013.eqiad.wmnet [production]
12:28 <volans@cumin1001> START - Cookbook sre.netbox.update-extras rolling update on A:netbox [production]
12:28 <stevemunene@puppetmaster1001> conftool action : set/pooled=no; selector: name=aqs1012.eqiad.wmnet [production]
12:28 <volans@cumin1001> END (FAIL) - Cookbook sre.netbox.update-extras (exit_code=1) rolling update on A:netbox-canary [production]
12:27 <volans@cumin1001> START - Cookbook sre.netbox.update-extras rolling update on A:netbox-canary [production]
12:26 <stevemunene@puppetmaster1001> conftool action : set/pooled=no; selector: name=datahubsearch1003.eqiad.wmnet [production]
12:24 <TimStarling> I noticed that mw2382 was still talking to mwlog1002. It still had old php-fpm7.4 processes despite the scap. So I manually restarted php-fpm on it. [production]
12:17 <tstarling@deploy2002> Synchronized src/Profiler.php: T331882 disable profiling for switch maintenance (duration: 05m 58s) [production]
11:35 <hnowlan@puppetmaster1001> conftool action : set/pooled=inactive; selector: service=thumbor,name=kubernetes101[0123].eqiad.wmnet [production]
11:24 <moritzm> installing joblib security updates [production]
10:17 <hnowlan@puppetmaster1001> conftool action : set/pooled=yes:weight=5; selector: service=thumbor,name=kubernetes101[0123].eqiad.wmnet [production]
09:51 <hashar@deploy2002> rebuilt and synchronized wikiversions files: Revert "group0 wikis to 1.41.0-wmf.3" | T330209 [production]
09:42 <hashar@deploy2002> rebuilt and synchronized wikiversions files: group0 wikis to 1.41.0-wmf.3 refs T330209 [production]
09:20 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1201 (T333332)', diff saved to https://phabricator.wikimedia.org/P46025 and previous config saved to /var/cache/conftool/dbconfig/20230404-091639-ladsgroup.json [production]
09:19 <hashar@deploy2002> Pruned MediaWiki: 1.41.0-wmf.1 (duration: 02m 16s) [production]
09:12 <hashar@deploy2002> Finished scap: testwikis wikis to 1.41.0-wmf.3 refs T330209 (duration: 40m 20s) [production]
09:09 <moritzm> installing libmicrohttpd security updates [production]
09:07 <moritzm> installing libdatetime-timezone-perl updates [production]
09:04 <akosiaris@deploy2002> helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'sync'. [production]
09:04 <akosiaris@deploy2002> helmfile [aux-k8s-eqiad] START helmfile.d/admin 'sync'. [production]
09:04 <akosiaris@deploy2002> helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'sync'. [production]
09:04 <akosiaris@deploy2002> helmfile [dse-k8s-eqiad] START helmfile.d/admin 'sync'. [production]
09:03 <akosiaris@deploy2002> helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'. [production]
09:03 <akosiaris@deploy2002> helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'. [production]
09:03 <akosiaris@deploy2002> helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'. [production]
09:03 <akosiaris@deploy2002> helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'. [production]